[1/7] pack-write.c: plug a leak in stage_tmp_packfiles()

Message ID 18be29c3988295cd58521f8cc4a729897df074c8.1681166596.git.me@ttaylorr.com (mailing list archive)
State Superseded
Series pack-revindex: enable on-disk reverse indexes by default

Commit Message

Taylor Blau April 10, 2023, 10:53 p.m. UTC
The function `stage_tmp_packfiles()` generates a filename to use for
staging the contents of what will become the pack's ".rev" file.

The name is generated in `write_rev_file_order()` (via its caller
`write_rev_file()`) in a string buffer, and the result is returned back
to `stage_tmp_packfiles()` which uses it to rename the temporary file
into place via `rename_tmp_packfile()`.

That name is not visible outside of `stage_tmp_packfiles()`, so it can
(and should) be `free()`'d at the end of that function. We can't free it
in `rename_tmp_packfile()` since not all of its `source` arguments are
unreachable after calling it.

Instead, simply free() `rev_tmp_name` at the end of
`stage_tmp_packfiles()`.

(Note that the same leak exists for `mtimes_tmp_name`, but we do not
address it in this commit).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-write.c | 2 ++
 1 file changed, 2 insertions(+)
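
The ownership situation described in the commit message can be sketched in isolation. This is a minimal illustration, not code from git: `make_tmp_name()`, `use_name()`, and `stage()` are hypothetical stand-ins for `write_rev_file()`, `rename_tmp_packfile()`, and `stage_tmp_packfiles()`, and the error handling is reduced to an assertion.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for write_rev_file(): hands back a
 * heap-allocated name through a const pointer, or NULL when
 * no file is written.
 */
static const char *make_tmp_name(int want_rev)
{
	char *name;

	if (!want_rev)
		return NULL;
	name = malloc(sizeof("pack/tmp_rev_XXXXXX"));
	assert(name); /* sketch only: no real error handling */
	strcpy(name, "pack/tmp_rev_XXXXXX");
	return name;
}

/*
 * Hypothetical stand-in for rename_tmp_packfile(): it reads the
 * name but must not free it, because other callers pass names
 * that they still need afterwards.
 */
static size_t use_name(const char *source)
{
	return strlen(source);
}

/* Mirrors the shape of stage_tmp_packfiles() after the patch. */
static size_t stage(int want_rev)
{
	const char *rev_tmp_name = make_tmp_name(want_rev);
	size_t used = 0;

	if (rev_tmp_name)
		used = use_name(rev_tmp_name);

	/* The name is not visible to anyone else, so release it here. */
	free((char *)rev_tmp_name);
	return used;
}
```

Calling `stage(1)` allocates, uses, and releases the name within one function; `stage(0)` never allocates and the final `free()` receives NULL.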

Comments

Derrick Stolee April 11, 2023, 1:23 p.m. UTC | #1
On 4/10/2023 6:53 PM, Taylor Blau wrote:

> Instead, simply free() `rev_tmp_name` at the end of
> `stage_tmp_packfiles()`.

> @@ -568,6 +568,8 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
>  		rename_tmp_packfile(name_buffer, rev_tmp_name, "rev");
>  	if (mtimes_tmp_name)
>  		rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes");
> +
> +	free((char *)rev_tmp_name);

Just cut off from the context is an "if (rev_tmp_name)", so it might be
good to group this into that block, since we have the condition, anyway.


But I was also thinking about how we like to use "const" as an indicator
of "I am not responsible for free()ing this". And this comes from the
public write_rev_file() method. Based on the API prototype, we could
think that this string is held by a static strbuf (making the method
not reentrant, but that happens sometimes in our methods). But generally,
I wanted to inspect what it would take to make the API reflect the fact
that it can return a "new" string.

But there are three issues:

 1. The actual logic is inside write_rev_file_order(), so that API
    needs to change, too.

 2. The "new" string is created only if the rev_name parameter is
    NULL, which is somewhat understandable but still requires
    inside knowledge about the implementation to make that choice.

 3. If we inspect the callers to these methods, only one caller
    passes a non-null name: builtin/index-pack.c. The rest pass NULL,
    including write_midx_reverse_index() (which then leaks the name).
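
The ownership ambiguity in the list above can be sketched outside of git. This is an illustration under assumptions: `name_mixed()`, `name_owned()`, and `dup_string()` are hypothetical helpers, not functions from pack-write.c. The first mimics an API whose result is newly allocated only when the argument is NULL; the second always returns a string the caller must free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of s. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	assert(copy); /* sketch only: no real error handling */
	memcpy(copy, s, len);
	return copy;
}

/*
 * Mixed ownership: the result is newly allocated only when "name"
 * is NULL. The const return type hides that, and the caller has to
 * remember what it passed in to know whether to free the result.
 */
static const char *name_mixed(const char *name)
{
	if (!name)
		return dup_string("tmp_rev_XXXXXX");
	return name;
}

/*
 * Uniform ownership: the result is always a fresh allocation, so
 * the non-const return type tells every caller to free it.
 */
static char *name_owned(const char *name)
{
	return dup_string(name ? name : "tmp_rev_XXXXXX");
}
```

With `name_owned()`, a caller can free the result unconditionally, which is the property the diff below aims for by returning a non-const string on every path.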

The below diff includes my attempt to change the API to return a
non-const string that must be freed by the callers.

Thanks,
-Stolee

--- >8 ---

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index b17e79cd40f..6d2fa52f9c4 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1725,7 +1725,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 {
 	int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index;
 	const char *curr_index;
-	const char *curr_rev_index = NULL;
+	char *curr_rev_index = NULL;
 	const char *index_name = NULL, *pack_name = NULL, *rev_index_name = NULL;
 	const char *keep_msg = NULL;
 	const char *promisor_msg = NULL;
@@ -1956,8 +1956,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 		free((void *) curr_pack);
 	if (!index_name)
 		free((void *) curr_index);
-	if (!rev_index_name)
-		free((void *) curr_rev_index);
+	free(curr_rev_index);
 
 	/*
 	 * Let the caller know this pack is not self contained
diff --git a/midx.c b/midx.c
index 9af3e5de889..85154bedd73 100644
--- a/midx.c
+++ b/midx.c
@@ -945,7 +945,7 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
 				     struct write_midx_context *ctx)
 {
 	struct strbuf buf = STRBUF_INIT;
-	const char *tmp_file;
+	char *tmp_file;
 
 	trace2_region_enter("midx", "write_midx_reverse_index", the_repository);
 
@@ -958,6 +958,7 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
 		die(_("cannot store reverse index file"));
 
 	strbuf_release(&buf);
+	free(tmp_file);
 
 	trace2_region_leave("midx", "write_midx_reverse_index", the_repository);
 }
diff --git a/pack-write.c b/pack-write.c
index f1714054951..73850c061d9 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -207,15 +207,15 @@ static void write_rev_trailer(struct hashfile *f, const unsigned char *hash)
 	hashwrite(f, hash, the_hash_algo->rawsz);
 }
 
-const char *write_rev_file(const char *rev_name,
-			   struct pack_idx_entry **objects,
-			   uint32_t nr_objects,
-			   const unsigned char *hash,
-			   unsigned flags)
+char *write_rev_file(const char *rev_name,
+		     struct pack_idx_entry **objects,
+		     uint32_t nr_objects,
+		     const unsigned char *hash,
+		     unsigned flags)
 {
 	uint32_t *pack_order;
 	uint32_t i;
-	const char *ret;
+	char *ret;
 
 	if (!(flags & WRITE_REV) && !(flags & WRITE_REV_VERIFY))
 		return NULL;
@@ -233,12 +233,13 @@ const char *write_rev_file(const char *rev_name,
 	return ret;
 }
 
-const char *write_rev_file_order(const char *rev_name,
-				 uint32_t *pack_order,
-				 uint32_t nr_objects,
-				 const unsigned char *hash,
-				 unsigned flags)
+char *write_rev_file_order(const char *rev_name,
+			   uint32_t *pack_order,
+			   uint32_t nr_objects,
+			   const unsigned char *hash,
+			   unsigned flags)
 {
+	char *ret_name;
 	struct hashfile *f;
 	int fd;
 
@@ -249,10 +250,11 @@ const char *write_rev_file_order(const char *rev_name,
 		if (!rev_name) {
 			struct strbuf tmp_file = STRBUF_INIT;
 			fd = odb_mkstemp(&tmp_file, "pack/tmp_rev_XXXXXX");
-			rev_name = strbuf_detach(&tmp_file, NULL);
+			rev_name = ret_name = strbuf_detach(&tmp_file, NULL);
 		} else {
 			unlink(rev_name);
 			fd = xopen(rev_name, O_CREAT|O_EXCL|O_WRONLY, 0600);
+			ret_name = xstrdup(rev_name);
 		}
 		f = hashfd(fd, rev_name);
 	} else if (flags & WRITE_REV_VERIFY) {
@@ -264,6 +266,7 @@ const char *write_rev_file_order(const char *rev_name,
 			} else
 				die_errno(_("could not stat: %s"), rev_name);
 		}
+		ret_name = xstrdup(rev_name);
 		f = hashfd_check(rev_name);
 	} else
 		return NULL;
@@ -280,7 +283,7 @@ const char *write_rev_file_order(const char *rev_name,
 			  CSUM_HASH_IN_STREAM | CSUM_CLOSE |
 			  ((flags & WRITE_IDX_VERIFY) ? 0 : CSUM_FSYNC));
 
-	return rev_name;
+	return ret_name;
 }
 
 static void write_mtimes_header(struct hashfile *f)
@@ -543,7 +546,7 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
 			 unsigned char hash[],
 			 char **idx_tmp_name)
 {
-	const char *rev_tmp_name = NULL;
+	char *rev_tmp_name = NULL;
 	const char *mtimes_tmp_name = NULL;
 
 	if (adjust_shared_perm(pack_tmp_name))
@@ -564,8 +567,10 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
 	}
 
 	rename_tmp_packfile(name_buffer, pack_tmp_name, "pack");
-	if (rev_tmp_name)
+	if (rev_tmp_name) {
 		rename_tmp_packfile(name_buffer, rev_tmp_name, "rev");
+		free(rev_tmp_name);
+	}
 	if (mtimes_tmp_name)
 		rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes");
 }
diff --git a/pack.h b/pack.h
index 3ab9e3f60c0..02bbdfb19cc 100644
--- a/pack.h
+++ b/pack.h
@@ -96,8 +96,8 @@ struct ref;
 
 void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought);
 
-const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
-const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
+char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
+char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
 
 /*
  * The "hdr" output buffer should be at least this big, which will handle sizes

Taylor Blau April 11, 2023, 9:25 p.m. UTC | #2
On Tue, Apr 11, 2023 at 09:23:31AM -0400, Derrick Stolee wrote:
> On 4/10/2023 6:53 PM, Taylor Blau wrote:
>
> > Instead, simply free() `rev_tmp_name` at the end of
> > `stage_tmp_packfiles()`.
>
> > @@ -568,6 +568,8 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
> >  		rename_tmp_packfile(name_buffer, rev_tmp_name, "rev");
> >  	if (mtimes_tmp_name)
> >  		rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes");
> > +
> > +	free((char *)rev_tmp_name);
>
> Just cut off from the context is an "if (rev_tmp_name)", so it might be
> good to group this into that block, since we have the condition, anyway.

Definitely possible, though FWIW it's fine to have this free()
positioned at the end of the function, since we initialize rev_tmp_name
to NULL (making this a noop when not writing an on-disk reverse index).
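
The two placements can be compared side by side. This is a small sketch with hypothetical names (`maybe_name()`, `free_in_block()`, `free_at_end()`), relying only on the C standard's guarantee that `free(NULL)` performs no action.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocates a name only when asked to. */
static char *maybe_name(int wanted)
{
	char *name = NULL;

	if (wanted) {
		name = malloc(4);
		assert(name); /* sketch only: no real error handling */
		memcpy(name, "rev", 4);
	}
	return name;
}

/* Variant A: free inside the condition that already exists. */
static int free_in_block(int wanted)
{
	char *name = maybe_name(wanted);
	int used = 0;

	if (name) {
		used = 1;
		free(name);
	}
	return used;
}

/* Variant B: free unconditionally at the end of the function. */
static int free_at_end(int wanted)
{
	char *name = maybe_name(wanted);
	int used = name != NULL;

	free(name); /* does nothing when name is NULL */
	return used;
}
```

Both variants release the allocation exactly once when it exists and do nothing otherwise, so the choice between them is one of style.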

> But I was also thinking about how we like to use "const" as an indicator
> of "I am not responsible for free()ing this". And this comes from the
> public write_rev_file() method. Based on the API prototype, we could
> think that this string is held by a static strbuf (making the method
> not reentrant, but that happens sometimes in our methods). But generally,
> I wanted to inspect what it would take to make the API reflect the fact
> that it can return a "new" string.
>
> But there are three issues:
>
>  1. The actual logic is inside write_rev_file_order(), so that API
>     needs to change, too.
>
>  2. The "new" string is created only if the rev_name parameter is
>     NULL, which is somewhat understandable but still requires
>     inside knowledge about the implementation to make that choice.
>
>  3. If we inspect the callers to these methods, only one caller
>     passes a non-null name: builtin/index-pack.c. The rest pass NULL,
>     including write_midx_reverse_index() (which then leaks the name).
>
> The below diff includes my attempt to change the API to return a
> non-const string that must be freed by the callers.

I like this direction. I think that all things being equal (and unless
you feel strongly about it in the meantime), I'd just as soon pursue
this as a "fast follow" to avoid intermixing this API change with the
primary intent of this series.

In the meantime, dropping the const via a cast down to "char *" works
fine to plug the leak here.

Thanks,
Taylor

Patch

diff --git a/pack-write.c b/pack-write.c
index f171405495..f27c1f7f28 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -568,6 +568,8 @@  void stage_tmp_packfiles(struct strbuf *name_buffer,
 		rename_tmp_packfile(name_buffer, rev_tmp_name, "rev");
 	if (mtimes_tmp_name)
 		rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes");
+
+	free((char *)rev_tmp_name);
 }
 
 void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought)