diff mbox series

[02/11] midx.c: don't leak MIDX from verify_midx_file

Message ID b0c79904ab7bdaee7a1bc7a55b0fb26b1f2cf8d3.1634787555.git.me@ttaylorr.com (mailing list archive)
State Superseded
Headers show
Series midx: clean up t5319 under 'SANITIZE=leak' | expand

Commit Message

Taylor Blau Oct. 21, 2021, 3:39 a.m. UTC
The function midx.c:verify_midx_file() allocate a MIDX struct by calling
load_multi_pack_index(). But when cleaning up, it calls free() without
freeing any resources associated with the MIDX.

Call the more appropriate close_midx() which does free those resources,
which causes t5319.3 to pass when Git is compiled with SANITIZE=leak.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 midx.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Eric Sunshine Oct. 21, 2021, 5 a.m. UTC | #1
On Wed, Oct 20, 2021 at 11:39 PM Taylor Blau <me@ttaylorr.com> wrote:
> The function midx.c:verify_midx_file() allocate a MIDX struct by calling
> load_multi_pack_index(). But when cleaning up, it calls free() without
> freeing any resources associated with the MIDX.

s/allocate/allocates/

> Call the more appropriate close_midx() which does free those resources,
> which causes t5319.3 to pass when Git is compiled with SANITIZE=leak.
>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
> diff --git a/midx.c b/midx.c
> @@ -1611,7 +1611,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
> -               return verify_midx_error;
> +               goto cleanup;
>         }
> @@ -1689,7 +1689,9 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
>         stop_progress(&progress);
>
> +cleanup:
>         free(pairs);
> +       close_midx(m);
>
>         return verify_midx_error;
>  }

Using the `goto` idiom to ensure cleanup makes perfect sense. For a
few reasons[*], I did spend a some moments wondering if the cognitive
load might be a bit lower by instead adding two close_midx() calls --
one at the early return and one just before the normal return --
rather than using a `goto`, but it's so subjective that it's not worth
worrying about.

FOOTNOTES

[*] First, unlike most cases in which the `goto` jumps over relatively
short blocks of code, the distance in this case between `goto` and the
new label is significant and it's not easy to see at a glance what is
being skipped and how important it might be. Second, `pairs` is still
NULL at the point of the `goto`; I spent extra time checking and
double-checking what free(pairs) was doing in this instance. Finally,
there are enough start/stop-progress calls in this function that it
requires extra mental effort to determine that this `goto` is indeed
safe (at least for the present).
Junio C Hamano Oct. 21, 2021, 5:54 a.m. UTC | #2
Eric Sunshine <sunshine@sunshineco.com> writes:

> On Wed, Oct 20, 2021 at 11:39 PM Taylor Blau <me@ttaylorr.com> wrote:
>> The function midx.c:verify_midx_file() allocate a MIDX struct by calling
>> load_multi_pack_index(). But when cleaning up, it calls free() without
>> freeing any resources associated with the MIDX.
>
> s/allocate/allocates/
>
>> Call the more appropriate close_midx() which does free those resources,
>> which causes t5319.3 to pass when Git is compiled with SANITIZE=leak.
>>
>> Signed-off-by: Taylor Blau <me@ttaylorr.com>
>> ---
>> diff --git a/midx.c b/midx.c
>> @@ -1611,7 +1611,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
>> -               return verify_midx_error;
>> +               goto cleanup;
>>         }
>> @@ -1689,7 +1689,9 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
>>         stop_progress(&progress);
>>
>> +cleanup:
>>         free(pairs);
>> +       close_midx(m);
>>
>>         return verify_midx_error;
>>  }
>
> Using the `goto` idiom to ensure cleanup makes perfect sense. For a
> few reasons[*], I did spend a some moments wondering if the cognitive
> load might be a bit lower by instead adding two close_midx() calls --
> one at the early return and one just before the normal return --
> rather than using a `goto`, but it's so subjective that it's not worth
> worrying about.
>
> FOOTNOTES
>
> [*] First, unlike most cases in which the `goto` jumps over relatively
> short blocks of code, the distance in this case between `goto` and the
> new label is significant and it's not easy to see at a glance what is
> being skipped and how important it might be. Second, `pairs` is still
> NULL at the point of the `goto`; I spent extra time checking and
> double-checking what free(pairs) was doing in this instance. Finally,
> there are enough start/stop-progress calls in this function that it
> requires extra mental effort to determine that this `goto` is indeed
> safe (at least for the present).

Carefully reading the patch is very good, but at least in a patch
like this, it is immediately obvious that the patch is not making
any of the things the footnote worries about any worse than the
original, by replacing "return" (which by definition will skip all
the things the goto did jump over) with a "goto" to the bottom, no?

Thanks.
Junio C Hamano Oct. 21, 2021, 4:27 p.m. UTC | #3
Taylor Blau <me@ttaylorr.com> writes:

> The function midx.c:verify_midx_file() allocate a MIDX struct by calling
> load_multi_pack_index(). But when cleaning up, it calls free() without
> freeing any resources associated with the MIDX.
>
> Call the more appropriate close_midx() which does free those resources,
> which causes t5319.3 to pass when Git is compiled with SANITIZE=leak.

Nice.

By the way, the function starts like so:

    int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags)
    {
            struct pair_pos_vs_id *pairs = NULL;
            uint32_t i;
            struct progress *progress = NULL;
            struct multi_pack_index *m = load_multi_pack_index(object_dir, 1);
            verify_midx_error = 0;

            if (!m) {
                    int result = 0;
                    struct stat sb;
                    char *filename = get_midx_filename(object_dir);
                    if (!stat(filename, &sb)) {
                            error(_("multi-pack-index file exists, but failed to parse"));
                            result = 1;
                    }
                    free(filename);
                    return result;
            }

but after seeing die() sprinkled in load_multi_pack_index() with
checks during parsing, I am not sure if this is a good error
reporting mechanism we are seeing here.

It is wonderful to plug leaks here and there, but to be honest, even
with only a very little parts I saw in this code, I think there are
other things that need clean-up here.

Also, the way the file-scope global verify_midx_error is used is
beyond words _ugly_, if the only reason for its use is to make
midx_report() look simpler, which is what I think is happening.

Not very impressed, but all of the above is not a new issue
introduced by this patch.

Thanks.
diff mbox series

Patch

diff --git a/midx.c b/midx.c
index 36e4754767..ad60e48468 100644
--- a/midx.c
+++ b/midx.c
@@ -1611,7 +1611,7 @@  int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 		 * Remaining tests assume that we have objects, so we can
 		 * return here.
 		 */
-		return verify_midx_error;
+		goto cleanup;
 	}
 
 	if (flags & MIDX_PROGRESS)
@@ -1689,7 +1689,9 @@  int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 	}
 	stop_progress(&progress);
 
+cleanup:
 	free(pairs);
+	close_midx(m);
 
 	return verify_midx_error;
 }