[v3,22/22] builtin/diff: free symmetric diff members


Message ID

31e38ba4e150c9bc9e3aa1073869881ccba9035e.1723540931.git.ps@pks.im (mailing list archive)

State

Superseded

Headers

Feedback-ID: i197146af:Fastmail
Date: Tue, 13 Aug 2024 11:32:16 +0200
From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: James Liu <james@jamesliu.io>, karthik nayak <karthik.188@gmail.com>,
	Phillip Wood <phillip.wood123@gmail.com>,
	Junio C Hamano <gitster@pobox.com>, Taylor Blau <me@ttaylorr.com>
Subject: [PATCH v3 22/22] builtin/diff: free symmetric diff members
Message-ID: 
 <31e38ba4e150c9bc9e3aa1073869881ccba9035e.1723540931.git.ps@pks.im>
References: <cover.1722933642.git.ps@pks.im>
 <cover.1723540931.git.ps@pks.im>
Precedence: bulk
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <cover.1723540931.git.ps@pks.im>

Series

Memory leak fixes (pt.4)

Commit Message

Patrick Steinhardt Aug. 13, 2024, 9:32 a.m. UTC

We populate a `struct symdiff` in case the user has requested a
symmetric diff. Part of this is to populate a `skip` bitmap that
indicates which commits shall be ignored in the diff. But while this
bitmap is dynamically allocated, we never free it.

Fix this by introducing and calling a new `symdiff_release()` function
that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/diff.c                       | 10 +++++++++-
 t/t4068-diff-symmetric-merge-base.sh |  1 +
 t/t4108-apply-threeway.sh            |  1 +
 3 files changed, 11 insertions(+), 1 deletion(-)

Comments

Junio C Hamano Aug. 13, 2024, 4:25 p.m. UTC | #1

Patrick Steinhardt <ps@pks.im> writes:

> We populate a `struct symdiff` in case the user has requested a
> symmetric diff. Part of this is to populate a `skip` bitmap that
> indicates which commits shall be ignored in the diff. But while this
> bitmap is dynamically allocated, we never free it.
>
> Fix this by introducing and calling a new `symdiff_release()` function
> that does this for us.

OK.

> +static void symdiff_release(struct symdiff *sdiff)
> +{
> +	if (!sdiff)
> +		return;
> +	bitmap_free(sdiff->skip);
> +}

Hmph, wouldn't it be a BUG if any caller feeds a NULL pointer to it,
though?  When symdiff was prepared but not used, sdiff->skip will be
NULL but sdiff is never NULL even in such a case.
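
A minimal sketch of the variant this comment points toward, assuming simplified stand-ins for `struct bitmap` and `struct symdiff` (the real definitions in Git have more members). The `assert()` stands in for a BUG-style check, and resetting `skip` after freeing is an addition made for the sketch, not part of the patch:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for Git's structures. */
struct bitmap {
	unsigned long *words;
	size_t word_alloc;
};

struct symdiff {
	struct bitmap *skip;
};

/* Like free(), this accepts NULL and does nothing in that case. */
static void bitmap_free(struct bitmap *map)
{
	if (!map)
		return;
	free(map->words);
	free(map);
}

/*
 * No silent NULL guard on `sdiff` itself: callers pass the address of
 * a real struct, so only the `skip` member may legitimately be NULL.
 */
static void symdiff_release(struct symdiff *sdiff)
{
	assert(sdiff); /* a NULL struct pointer would be a caller bug */
	bitmap_free(sdiff->skip);
	sdiff->skip = NULL; /* sketch-only: makes a repeated release harmless */
}
```

With this shape, a symdiff that was prepared but never used (its `skip` still NULL) is released without any special casing.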

> @@ -398,7 +405,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
>  	struct object_array_entry *blob[2];
>  	int nongit = 0, no_index = 0;
>  	int result;
> -	struct symdiff sdiff;
> +	struct symdiff sdiff = {0};

And symdiff_prepare() at least clears its .skip member to NULL, so
this pre-initialization is probably not needed.  If we are preparing
ourselves for future changes of the flow in this function (e.g.
goto's that jump to the clean-up label from which symdiff_release()
is always called, even when we did not call symdiff_prepare() on
this thing), this is probably not sufficient to convey that
intention (instead I'd use an explicit ".skip = NULL" to say "we
might not even call _prepare() but this one is prepared to be passed
to _release() even in such a case").

Given that no such goto exists, and that _prepare() always
sets up the .skip member appropriately, I wonder if we are much
better off leaving sdiff uninitialized at the declaration site here.
If we add such a goto that bypasses _prepare() in the future, the
compiler will notice that we are passing an uninitialized sdiff to
_release(), no?

> @@ -619,6 +626,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
>  		refresh_index_quietly();
>  	release_revisions(&rev);
>  	object_array_clear(&ent);
> +	symdiff_release(&sdiff);
>  	UNLEAK(blob);
>  	return result;
>  }

Other than that, this looks cleanly done.  Thanks.

Patrick Steinhardt Aug. 14, 2024, 5:01 a.m. UTC | #2

On Tue, Aug 13, 2024 at 09:25:41AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > We populate a `struct symdiff` in case the user has requested a
> > symmetric diff. Part of this is to populate a `skip` bitmap that
> > indicates which commits shall be ignored in the diff. But while this
> > bitmap is dynamically allocated, we never free it.
> >
> > Fix this by introducing and calling a new `symdiff_release()` function
> > that does this for us.
> 
> OK.
> 
> > +static void symdiff_release(struct symdiff *sdiff)
> > +{
> > +	if (!sdiff)
> > +		return;
> > +	bitmap_free(sdiff->skip);
> > +}
> 
> Hmph, wouldn't it be a BUG if any caller feeds a NULL pointer to it,
> though?  When symdiff was prepared but not used, sdiff->skip will be
> NULL but sdiff is never NULL even in such a case.

Good point. It does make sense for `_free()` functions to handle NULL
pointers, but doesn't quite for `_release()` ones.

> > @@ -398,7 +405,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
> >  	struct object_array_entry *blob[2];
> >  	int nongit = 0, no_index = 0;
> >  	int result;
> > -	struct symdiff sdiff;
> > +	struct symdiff sdiff = {0};
> 
> And symdiff_prepare() at least clears its .skip member to NULL, so
> this pre-initialization is probably not needed.  If we are preparing
> ourselves for future changes of the flow in this function (e.g.
> goto's that jump to the clean-up label from which symdiff_release()
> is always called, even when we did not call symdiff_prepare() on
> this thing), this is probably not sufficient to convey that
> intention (instead I'd use an explicit ".skip = NULL" to say "we
> might not even call _prepare() but this one is prepared to be passed
> to _release() even in such a case").
> 
> Given that no such goto exists, and that _prepare() always
> sets up the .skip member appropriately, I wonder if we are much
> better off leaving sdiff uninitialized at the declaration site here.
> If we add such a goto that bypasses _prepare() in the future, the
> compiler will notice that we are passing an uninitialized sdiff to
> _release(), no?

You'd hope it does, but it certainly depends on your compiler flags.
Various hardening flags for example implicitly initialize variables, and
I have a feeling that this also causes them to not emit any warnings
anymore. At least I only spot such warnings in CI.

In any case, yes, we can drop the initialization here.

Patrick

Junio C Hamano Aug. 14, 2024, 3:28 p.m. UTC | #3

Patrick Steinhardt <ps@pks.im> writes:

> Good point. It does make sense for `_free()` functions to handle NULL
> pointers, but doesn't quite for `_release()` ones.

I agree that foo_free() should accept NULL and silently become a
no-op.  I do not care deeply whether foo_release() did the same, or
not, as long as all *_release()s behave the same way.  Maybe it is
more convenient if they ignored NULL, as I have a hunch that feeding
a NULL pointer to foo_release() is unlikely to be a bug.
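
To make the distinction concrete, here is a rough sketch of the two conventions for a hypothetical `struct foo` (all names are illustrative, not taken from Git). It shows the option where `foo_release()` also ignores NULL, which is only one of the two behaviors discussed above:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure owning one heap-allocated member. */
struct foo {
	char *name;
};

/* Allocate a struct foo on the heap together with a copy of `name`. */
static struct foo *foo_new(const char *name)
{
	size_t len = strlen(name) + 1;
	struct foo *f = malloc(sizeof(*f));

	if (!f)
		return NULL;
	f->name = malloc(len);
	if (!f->name) {
		free(f);
		return NULL;
	}
	memcpy(f->name, name, len);
	return f;
}

/*
 * Release: free what the struct owns, but not the struct itself.
 * Ignoring NULL here is the "more convenient" option floated above.
 */
static void foo_release(struct foo *f)
{
	if (!f)
		return;
	free(f->name);
	f->name = NULL;
}

/* Free: release the contents and the struct; NULL is a silent no-op. */
static void foo_free(struct foo *f)
{
	if (!f)
		return;
	foo_release(f);
	free(f);
}
```

A stack-allocated `struct foo` would be passed to `foo_release()`, while a pointer obtained from `foo_new()` would go to `foo_free()`.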

Since we documented our aspiration to use these (and foo_clear())
consistently, we may #leftoverbits want to document the calling
convention as well.

>> And symdiff_prepare() at least clears its .skip member to NULL, so
>> this pre-initialization is probably not needed.  If we are preparing
>> ourselves for future changes of the flow in this function (e.g.
>> goto's that jump to the clean-up label from which symdiff_release()
>> is always called, even when we did not call symdiff_prepare() on
>> this thing), this is probably not sufficient to convey that
>> intention (instead I'd use an explicit ".skip = NULL" to say "we
>> might not even call _prepare() but this one is prepared to be passed
>> to _release() even in such a case").
>> 
>> Given that no such goto exists, and that _prepare() always
>> sets up the .skip member appropriately, I wonder if we are much
>> better off leaving sdiff uninitialized at the declaration site here.
>> If we add such a goto that bypasses _prepare() in the future, the
>> compiler will notice that we are passing an uninitialized sdiff to
>> _release(), no?
>
> You'd hope it does, but it certainly depends on your compiler flags.
> Various hardening flags for example implicitly initialize variables, and
> I have a feeling that this also causes them to not emit any warnings
> anymore. At least I only spot such warnings in CI.

Yeah, that is a sad fact in the real world X-<.  To be defensive, I
think an explicit "{ .skip = NULL }" or "{ 0 }" would not be too bad
and may even serve as a good reminder for developers who may want to
jump over the call to _prepare() in the future.

The explicit ".skip = NULL" says "we know it is safe to call
_release() with a struct that hasn't gone through _prepare(), as
long as its .skip member is cleared", but the story "{ 0 }" tells us
is not much more than "we clear just like everybody else", and that
is why I suggested the former (iow, I know both mean the same thing
to the C compiler---I just care more about what it tells the human
readers).
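
As an illustration of the scenario being guarded against, the following sketch uses a hypothetical function with a clean-up label that can be reached without calling `_prepare()`. The struct layouts and `run_diff_sketch()` are assumptions made for the example and are far simpler than the real `cmd_diff()`:

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for Git's structures. */
struct bitmap {
	unsigned long *words;
	size_t word_alloc;
};

struct symdiff {
	struct bitmap *skip;
	int warn;
};

/* Stand-in for symdiff_prepare(): always sets up the .skip member. */
static void symdiff_prepare(struct symdiff *sym)
{
	sym->skip = calloc(1, sizeof(*sym->skip));
}

static void symdiff_release(struct symdiff *sdiff)
{
	if (sdiff->skip)
		free(sdiff->skip->words);
	free(sdiff->skip);
	sdiff->skip = NULL;
}

/* Returns 0 on the normal path, -1 when bailing out early. */
static int run_diff_sketch(int bail_out_early)
{
	/*
	 * The explicit member tells readers that _release() is safe
	 * even if _prepare() never ran; "{ 0 }" compiles to the same.
	 */
	struct symdiff sdiff = { .skip = NULL };
	int result = -1;

	if (bail_out_early)
		goto cleanup; /* bypasses symdiff_prepare() */

	symdiff_prepare(&sdiff);
	result = 0;

cleanup:
	symdiff_release(&sdiff);
	return result;
}
```

Both paths reach `symdiff_release()` with a well-defined `.skip`, which is the property the initializer is meant to document.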

Thanks.

diff --git a/builtin/diff.c b/builtin/diff.c
index 9b6cdabe15..f87f68a5bc 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -388,6 +388,13 @@ static void symdiff_prepare(struct rev_info *rev, struct symdiff *sym)
 	sym->skip = map;
 }
 
+static void symdiff_release(struct symdiff *sdiff)
+{
+	if (!sdiff)
+		return;
+	bitmap_free(sdiff->skip);
+}
+
 int cmd_diff(int argc, const char **argv, const char *prefix)
 {
 	int i;
@@ -398,7 +405,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 	struct object_array_entry *blob[2];
 	int nongit = 0, no_index = 0;
 	int result;
-	struct symdiff sdiff;
+	struct symdiff sdiff = {0};
 
 	/*
 	 * We could get N tree-ish in the rev.pending_objects list.
@@ -619,6 +626,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 		refresh_index_quietly();
 	release_revisions(&rev);
 	object_array_clear(&ent);
+	symdiff_release(&sdiff);
 	UNLEAK(blob);
 	return result;
 }
diff --git a/t/t4068-diff-symmetric-merge-base.sh b/t/t4068-diff-symmetric-merge-base.sh
index eff63c16b0..4d6565e728 100755
--- a/t/t4068-diff-symmetric-merge-base.sh
+++ b/t/t4068-diff-symmetric-merge-base.sh
@@ -5,6 +5,7 @@ test_description='behavior of diff with symmetric-diff setups and --merge-base'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # build these situations:
diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh
index c558282bc0..3211e1e65f 100755
--- a/t/t4108-apply-threeway.sh
+++ b/t/t4108-apply-threeway.sh
@@ -5,6 +5,7 @@ test_description='git apply --3way'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 print_sanitized_conflicted_diff () {
