diff mbox series

[05/11] builtin/repack.c: avoid leaking child arguments

Message ID a68c77c00638bef7e8ce88929015a587d2d1b6f8.1634787555.git.me@ttaylorr.com (mailing list archive)
State Superseded
Headers show
Series midx: clean up t5319 under 'SANITIZE=leak' | expand

Commit Message

Taylor Blau Oct. 21, 2021, 3:39 a.m. UTC
`git repack` invokes a handful of child processes: one to write the
actual pack, and optionally ones to repack promisor objects and update
the MIDX.

In none of these cases do we bother to call child_process_clear(), which
frees the memory associated with each child's arguments and environment.

In order to do so, tweak each function that spawns a child process to
have a `cleanup` label that we always visit before returning from each
function. Then, make sure that we call child_process_clear() as a part
of that label.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/repack.c | 35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)

Comments

Derrick Stolee Oct. 21, 2021, 1:32 p.m. UTC | #1
On 10/20/2021 11:39 PM, Taylor Blau wrote:
> `git repack` invokes a handful of child processes: one to write the
> actual pack, and optionally ones to repack promisor objects and update
> the MIDX.
> 
> In none of these cases do we bother to call child_process_clear(), which
> frees the memory associated with each child's arguments and environment.
> 
> In order to do so, tweak each function that spawns a child process to
> have a `cleanup` label that we always visit before returning from each
> function. Then, make sure that we call child_process_clear() as a part
> of that label.
> 
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  builtin/repack.c | 35 ++++++++++++++++++++++++-----------
>  1 file changed, 24 insertions(+), 11 deletions(-)
> 
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 0b2d1e5d82..d16bab09a4 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -244,6 +244,7 @@ static void repack_promisor_objects(const struct pack_objects_args *args,
>  	struct child_process cmd = CHILD_PROCESS_INIT;
>  	FILE *out;
>  	struct strbuf line = STRBUF_INIT;
> +	int ret = 0;

nit: "ret" is short for "return" which makes me think "this will
be used as 'return ret;'" but we don't do that. Instead, we use
it as the _result_ of an inner call:

> -	if (finish_command(&cmd))
> +	ret = finish_command(&cmd);
> +
> +cleanup:
> +	child_process_clear(&cmd);
> +
> +	if (ret)
>  		die(_("could not finish pack-objects to repack promisor objects"));

For that reason, I would rename this to "res" or "result".

> -	return finish_command(&cmd);
> +	ret = finish_command(&cmd);
> +
> +cleanup:
> +	child_process_clear(&cmd);
> +
> +	return ret;

Here, you are taking the result of an inner call, but actually
returning it. This makes sense to be "ret".

> +cleanup:
>  	string_list_clear(&names, 0);
>  	string_list_clear(&rollback, 0);
>  	string_list_clear(&existing_nonkept_packs, 0);
>  	string_list_clear(&existing_kept_packs, 0);
>  	clear_pack_geometry(geometry);
>  	strbuf_release(&line);
> +	child_process_clear(&cmd);
>  
> -	return 0;
> +	return ret;

Also here is good.

Thanks,
-Stolee
Junio C Hamano Oct. 21, 2021, 4:37 p.m. UTC | #2
Taylor Blau <me@ttaylorr.com> writes:

> `git repack` invokes a handful of child processes: one to write the
> actual pack, and optionally ones to repack promisor objects and update
> the MIDX.
>
> In none of these cases do we bother to call child_process_clear(), which
> frees the memory associated with each child's arguments and environment.
>
> In order to do so, tweak each function that spawns a child process to
> have a `cleanup` label that we always visit before returning from each
> function. Then, make sure that we call child_process_clear() as a part
> of that label.

I have a slight aversion against the fact that this patch makes us
call child_process_clear(), when we know we may have called
finish_command().  Callers of finish_command() in other codepaths
rely on the fact that finish_command() calls child_process_clear(),
which means that because of this change, we are now constrained to
keep child_process_clear() idempotent?
Junio C Hamano Oct. 21, 2021, 6:47 p.m. UTC | #3
Derrick Stolee <stolee@gmail.com> writes:

>>  	struct child_process cmd = CHILD_PROCESS_INIT;
>>  	FILE *out;
>>  	struct strbuf line = STRBUF_INIT;
>> +	int ret = 0;
>
> nit: "ret" is short for "return" which makes me think "this will
> be used as 'return ret;'" but we don't do that. Instead, we use
> it as the _result_ of an inner call:

Yup, my reading hiccupped there, too.

>
>> -	if (finish_command(&cmd))
>> +	ret = finish_command(&cmd);
>> +
>> +cleanup:
>> +	child_process_clear(&cmd);
>> +
>> +	if (ret)
>>  		die(_("could not finish pack-objects to repack promisor objects"));
>
> For that reason, I would rename this to "res" or "result".

Yeah, if we are introducing a new variable, result is a good name.
Taylor Blau Oct. 22, 2021, 3:21 a.m. UTC | #4
On Thu, Oct 21, 2021 at 09:37:07AM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > `git repack` invokes a handful of child processes: one to write the
> > actual pack, and optionally ones to repack promisor objects and update
> > the MIDX.
> >
> > In none of these cases do we bother to call child_process_clear(), which
> > frees the memory associated with each child's arguments and environment.
> >
> > In order to do so, tweak each function that spawns a child process to
> > have a `cleanup` label that we always visit before returning from each
> > function. Then, make sure that we call child_process_clear() as a part
> > of that label.
>
> I have a slight aversion against the fact that this patch makes us
> call child_process_clear(), when we know we may have called
> finish_command().  Callers of finish_command() in other codepaths
> rely on the fact that finish_command() calls child_process_clear(),
> which means that because of this change, we are now constrained to
> keep child_process_clear() idempotent?

Good point. In functions besides cmd_repack(), it's easy enough to just
call child_process_clear() once before returning instead of using a
label. That has the added benefit of side-stepping the variable name
question that you and Stolee raised (but I think whether we call it
"result", "res", or "ret" is besides the point).

In cmd_repack(), I'd like to wrap the other cleanup-related calls
underneath a 'cleanup' label. So there we want to track whether or not
we have called finish_command(), so that we only call
child_process_clear() when we *haven't* already called it via
finish_command().

Thanks,
Taylor
diff mbox series

Patch

diff --git a/builtin/repack.c b/builtin/repack.c
index 0b2d1e5d82..d16bab09a4 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -244,6 +244,7 @@  static void repack_promisor_objects(const struct pack_objects_args *args,
 	struct child_process cmd = CHILD_PROCESS_INIT;
 	FILE *out;
 	struct strbuf line = STRBUF_INIT;
+	int ret = 0;
 
 	prepare_pack_objects(&cmd, args);
 	cmd.in = -1;
@@ -260,7 +261,7 @@  static void repack_promisor_objects(const struct pack_objects_args *args,
 
 	if (cmd.in == -1)
 		/* No packed objects; cmd was never started */
-		return;
+		goto cleanup;
 
 	close(cmd.in);
 
@@ -293,7 +294,12 @@  static void repack_promisor_objects(const struct pack_objects_args *args,
 		free(promisor_name);
 	}
 	fclose(out);
-	if (finish_command(&cmd))
+	ret = finish_command(&cmd);
+
+cleanup:
+	child_process_clear(&cmd);
+
+	if (ret)
 		die(_("could not finish pack-objects to repack promisor objects"));
 }
 
@@ -559,10 +565,10 @@  static int write_midx_included_packs(struct string_list *include,
 	struct string_list_item *item;
 	struct packed_git *largest = get_largest_active_pack(geometry);
 	FILE *in;
-	int ret;
+	int ret = 0;
 
 	if (!include->nr)
-		return 0;
+		goto cleanup;
 
 	cmd.in = -1;
 	cmd.git_cmd = 1;
@@ -587,14 +593,19 @@  static int write_midx_included_packs(struct string_list *include,
 
 	ret = start_command(&cmd);
 	if (ret)
-		return ret;
+		goto cleanup;
 
 	in = xfdopen(cmd.in, "w");
 	for_each_string_list_item(item, include)
 		fprintf(in, "%s\n", item->string);
 	fclose(in);
 
-	return finish_command(&cmd);
+	ret = finish_command(&cmd);
+
+cleanup:
+	child_process_clear(&cmd);
+
+	return ret;
 }
 
 int cmd_repack(int argc, const char **argv, const char *prefix)
@@ -608,7 +619,7 @@  int cmd_repack(int argc, const char **argv, const char *prefix)
 	struct pack_geometry *geometry = NULL;
 	struct strbuf line = STRBUF_INIT;
 	struct tempfile *refs_snapshot = NULL;
-	int i, ext, ret;
+	int i, ext, ret = 0;
 	FILE *out;
 	int show_progress = isatty(2);
 
@@ -794,7 +805,7 @@  int cmd_repack(int argc, const char **argv, const char *prefix)
 
 	ret = start_command(&cmd);
 	if (ret)
-		return ret;
+		goto cleanup;
 
 	if (geometry) {
 		FILE *in = xfdopen(cmd.in, "w");
@@ -819,7 +830,7 @@  int cmd_repack(int argc, const char **argv, const char *prefix)
 	fclose(out);
 	ret = finish_command(&cmd);
 	if (ret)
-		return ret;
+		goto cleanup;
 
 	if (!names.nr && !po_args.quiet)
 		printf_ln(_("Nothing new to pack."));
@@ -893,7 +904,7 @@  int cmd_repack(int argc, const char **argv, const char *prefix)
 		string_list_clear(&include, 0);
 
 		if (ret)
-			return ret;
+			goto cleanup;
 	}
 
 	reprepare_packed_git(the_repository);
@@ -946,12 +957,14 @@  int cmd_repack(int argc, const char **argv, const char *prefix)
 		write_midx_file(get_object_directory(), NULL, NULL, flags);
 	}
 
+cleanup:
 	string_list_clear(&names, 0);
 	string_list_clear(&rollback, 0);
 	string_list_clear(&existing_nonkept_packs, 0);
 	string_list_clear(&existing_kept_packs, 0);
 	clear_pack_geometry(geometry);
 	strbuf_release(&line);
+	child_process_clear(&cmd);
 
-	return 0;
+	return ret;
 }