Message ID | 18be29c3988295cd58521f8cc4a729897df074c8.1681166596.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | Superseded |
Series | pack-revindex: enable on-disk reverse indexes by default |
On 4/10/2023 6:53 PM, Taylor Blau wrote:

> Instead, simply free() `rev_tmp_name` at the end of
> `stage_tmp_packfiles()`.

> @@ -568,6 +568,8 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
>  		rename_tmp_packfile(name_buffer, rev_tmp_name, "rev");
>  	if (mtimes_tmp_name)
>  		rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes");
> +
> +	free((char *)rev_tmp_name);

Just cut off from the context is an "if (rev_tmp_name)", so it might be
good to group this into that block, since we have the condition, anyway.

But I was also thinking about how we like to use "const" as an indicator
of "I am not responsible for free()ing this". And this comes from the
public write_rev_file() method. Based on the API prototype, we could
think that this string is held by a static strbuf (making the method
not reentrant, but that happens sometimes in our methods). But generally,
I wanted to inspect what it would take to make the API reflect the fact
that it can return a "new" string.

But there are three issues:

 1. The actual logic is inside write_rev_file_order(), so that API
    needs to change, too.

 2. The "new" string is created only if the rev_name parameter is
    NULL, which is somewhat understandable but still requires
    inside knowledge about the implementation to make that choice.

 3. If we inspect the callers to these methods, only one caller
    passes a non-null name: builtin/index-pack.c. The rest pass NULL,
    including write_midx_reverse_index() (which then leaks the name).

The below diff includes my attempt to change the API to return a
non-const string that must be freed by the callers.
Thanks,
-Stolee

--- >8 ---

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index b17e79cd40f..6d2fa52f9c4 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1725,7 +1725,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 {
 	int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index;
 	const char *curr_index;
-	const char *curr_rev_index = NULL;
+	char *curr_rev_index = NULL;
 	const char *index_name = NULL, *pack_name = NULL, *rev_index_name = NULL;
 	const char *keep_msg = NULL;
 	const char *promisor_msg = NULL;
@@ -1956,8 +1956,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 		free((void *) curr_pack);
 	if (!index_name)
 		free((void *) curr_index);
-	if (!rev_index_name)
-		free((void *) curr_rev_index);
+	free(curr_rev_index);
 
 	/*
 	 * Let the caller know this pack is not self contained
diff --git a/midx.c b/midx.c
index 9af3e5de889..85154bedd73 100644
--- a/midx.c
+++ b/midx.c
@@ -945,7 +945,7 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
 				     struct write_midx_context *ctx)
 {
 	struct strbuf buf = STRBUF_INIT;
-	const char *tmp_file;
+	char *tmp_file;
 
 	trace2_region_enter("midx", "write_midx_reverse_index", the_repository);
 
@@ -958,6 +958,7 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
 		die(_("cannot store reverse index file"));
 
 	strbuf_release(&buf);
+	free(tmp_file);
 
 	trace2_region_leave("midx", "write_midx_reverse_index", the_repository);
 }
diff --git a/pack-write.c b/pack-write.c
index f1714054951..73850c061d9 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -207,15 +207,15 @@ static void write_rev_trailer(struct hashfile *f, const unsigned char *hash)
 	hashwrite(f, hash, the_hash_algo->rawsz);
 }
 
-const char *write_rev_file(const char *rev_name,
-			   struct pack_idx_entry **objects,
-			   uint32_t nr_objects,
-			   const unsigned char *hash,
-			   unsigned flags)
+char *write_rev_file(const char *rev_name,
+		     struct pack_idx_entry **objects,
+		     uint32_t nr_objects,
+		     const unsigned char *hash,
+		     unsigned flags)
 {
 	uint32_t *pack_order;
 	uint32_t i;
-	const char *ret;
+	char *ret;
 
 	if (!(flags & WRITE_REV) && !(flags & WRITE_REV_VERIFY))
 		return NULL;
@@ -233,12 +233,13 @@ const char *write_rev_file(const char *rev_name,
 	return ret;
 }
 
-const char *write_rev_file_order(const char *rev_name,
-				 uint32_t *pack_order,
-				 uint32_t nr_objects,
-				 const unsigned char *hash,
-				 unsigned flags)
+char *write_rev_file_order(const char *rev_name,
+			   uint32_t *pack_order,
+			   uint32_t nr_objects,
+			   const unsigned char *hash,
+			   unsigned flags)
 {
+	char *ret_name;
 	struct hashfile *f;
 	int fd;
 
@@ -249,10 +250,11 @@ const char *write_rev_file_order(const char *rev_name,
 		if (!rev_name) {
 			struct strbuf tmp_file = STRBUF_INIT;
 			fd = odb_mkstemp(&tmp_file, "pack/tmp_rev_XXXXXX");
-			rev_name = strbuf_detach(&tmp_file, NULL);
+			rev_name = ret_name = strbuf_detach(&tmp_file, NULL);
 		} else {
 			unlink(rev_name);
 			fd = xopen(rev_name, O_CREAT|O_EXCL|O_WRONLY, 0600);
+			ret_name = xstrdup(rev_name);
 		}
 		f = hashfd(fd, rev_name);
 	} else if (flags & WRITE_REV_VERIFY) {
@@ -264,6 +266,7 @@ const char *write_rev_file_order(const char *rev_name,
 			} else
 				die_errno(_("could not stat: %s"), rev_name);
 		}
+		ret_name = xstrdup(rev_name);
 		f = hashfd_check(rev_name);
 	} else
 		return NULL;
@@ -280,7 +283,7 @@ const char *write_rev_file_order(const char *rev_name,
 			  CSUM_HASH_IN_STREAM | CSUM_CLOSE |
 			  ((flags & WRITE_IDX_VERIFY) ? 0 : CSUM_FSYNC));
 
-	return rev_name;
+	return ret_name;
 }
 
 static void write_mtimes_header(struct hashfile *f)
@@ -543,7 +546,7 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
 			 unsigned char hash[],
 			 char **idx_tmp_name)
 {
-	const char *rev_tmp_name = NULL;
+	char *rev_tmp_name = NULL;
 	const char *mtimes_tmp_name = NULL;
 
 	if (adjust_shared_perm(pack_tmp_name))
@@ -564,8 +567,10 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
 	}
 
 	rename_tmp_packfile(name_buffer, pack_tmp_name, "pack");
-	if (rev_tmp_name)
+	if (rev_tmp_name) {
 		rename_tmp_packfile(name_buffer, rev_tmp_name, "rev");
+		free(rev_tmp_name);
+	}
 	if (mtimes_tmp_name)
 		rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes");
 }
diff --git a/pack.h b/pack.h
index 3ab9e3f60c0..02bbdfb19cc 100644
--- a/pack.h
+++ b/pack.h
@@ -96,8 +96,8 @@ struct ref;
 
 void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought);
 
-const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
-const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
+char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
+char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
 
 /*
  * The "hdr" output buffer should be at least this big, which will handle sizes
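
The ownership convention proposed in the review above can be illustrated outside of Git. The following is only a minimal sketch using the C standard library: `demo_write_file()` and `demo_caller()` are hypothetical names, and the temporary name is made up. It assumes the same rule as the proposed diff, namely that the returned string is always a fresh allocation, whether or not the caller supplied a name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for write_rev_file_order(). It returns the name
 * of the file it "wrote", always as a fresh heap allocation: either a
 * generated temporary name (when `name` is NULL) or a copy of `name`.
 */
char *demo_write_file(const char *name)
{
	const char *src = name ? name : "pack/tmp_rev_demo"; /* made-up name */
	size_t len = strlen(src) + 1;
	char *ret = malloc(len);

	if (!ret)
		abort();
	memcpy(ret, src, len);
	return ret;
}

/* The caller frees unconditionally, whichever name it passed in. */
size_t demo_caller(const char *name)
{
	char *written = demo_write_file(name);
	size_t len = strlen(written);

	assert(written != name); /* the result never aliases the input */
	free(written);
	return len;
}
```

With a non-const return type and an always-owned result, the caller no longer needs to remember whether it passed NULL, which is the "inside knowledge" the second issue in the list refers to.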
On Tue, Apr 11, 2023 at 09:23:31AM -0400, Derrick Stolee wrote:
> On 4/10/2023 6:53 PM, Taylor Blau wrote:
>
> > Instead, simply free() `rev_tmp_name` at the end of
> > `stage_tmp_packfiles()`.
>
> > @@ -568,6 +568,8 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
> >  		rename_tmp_packfile(name_buffer, rev_tmp_name, "rev");
> >  	if (mtimes_tmp_name)
> >  		rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes");
> > +
> > +	free((char *)rev_tmp_name);
>
> Just cut off from the context is an "if (rev_tmp_name)", so it might be
> good to group this into that block, since we have the condition, anyway.

Definitely possible, though FWIW it's fine to have this free()
positioned at the end of the function, since we initialize rev_tmp_name
to NULL (making this a noop when not writing an on-disk reverse index).

> But I was also thinking about how we like to use "const" as an indicator
> of "I am not responsible for free()ing this". And this comes from the
> public write_rev_file() method. Based on the API prototype, we could
> think that this string is held by a static strbuf (making the method
> not reentrant, but that happens sometimes in our methods). But generally,
> I wanted to inspect what it would take to make the API reflect the fact
> that it can return a "new" string.
>
> But there are three issues:
>
>  1. The actual logic is inside write_rev_file_order(), so that API
>     needs to change, too.
>
>  2. The "new" string is created only if the rev_name parameter is
>     NULL, which is somewhat understandable but still requires
>     inside knowledge about the implementation to make that choice.
>
>  3. If we inspect the callers to these methods, only one caller
>     passes a non-null name: builtin/index-pack.c. The rest pass NULL,
>     including write_midx_reverse_index() (which then leaks the name).
>
> The below diff includes my attempt to change the API to return a
> non-const string that must be freed by the callers.

I like this direction. I think that all things being equal (and unless
you feel strongly about it in the meantime), I'd just as soon pursue
this as a "fast follow" to avoid intermixing this API change with the
primary intent of this series.

In the meantime, dropping the const via a cast down to "char *" works
fine to plug the leak here.

Thanks,
Taylor
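
The point about the unguarded free() relies on standard C behavior: free(NULL) does nothing. A minimal sketch of that pattern follows, using only the C standard library. `demo_maybe_name()` and `demo_stage()` are hypothetical helpers, and the const return type is an assumption chosen to mirror the original write_rev_file() prototype.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns a heap-allocated name when `write_rev`
 * is nonzero, NULL otherwise. The const return type mirrors the
 * original write_rev_file() prototype.
 */
const char *demo_maybe_name(int write_rev)
{
	char *name;

	if (!write_rev)
		return NULL;
	name = malloc(sizeof("tmp_rev_demo"));
	if (!name)
		abort();
	strcpy(name, "tmp_rev_demo");
	return name;
}

/* Returns 1 if a name was produced, 0 otherwise; frees it either way. */
int demo_stage(int write_rev)
{
	const char *rev_tmp_name = NULL;
	int had_name;

	rev_tmp_name = demo_maybe_name(write_rev);
	had_name = rev_tmp_name != NULL;
	assert(had_name == (write_rev != 0));

	/*
	 * free(NULL) is defined to do nothing, so no "if" guard is
	 * needed. The cast drops const because free() takes void *.
	 */
	free((char *)rev_tmp_name);
	return had_name;
}
```

The cast is what makes this a stopgap: the prototype still says "const", so the reader has to know from elsewhere that the caller owns the string.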
diff --git a/pack-write.c b/pack-write.c
index f171405495..f27c1f7f28 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -568,6 +568,8 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
 		rename_tmp_packfile(name_buffer, rev_tmp_name, "rev");
 	if (mtimes_tmp_name)
 		rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes");
+
+	free((char *)rev_tmp_name);
 }
 
 void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought)
The function `stage_tmp_packfiles()` generates a filename to use for
staging the contents of what will become the pack's ".rev" file.

The name is generated in `write_rev_file_order()` (via its caller
`write_rev_file()`) in a string buffer, and the result is returned back
to `stage_tmp_packfiles()` which uses it to rename the temporary file
into place via `rename_tmp_packfile()`.

That name is not visible outside of `stage_tmp_packfiles()`, so it can
(and should) be `free()`'d at the end of that function. We can't free it
in `rename_tmp_packfile()` since not all of its `source` arguments are
unreachable after calling it.

Instead, simply free() `rev_tmp_name` at the end of
`stage_tmp_packfiles()`.

(Note that the same leak exists for `mtimes_tmp_name`, but we do not
address it in this commit).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-write.c | 2 ++
 1 file changed, 2 insertions(+)
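
The reasoning in the commit message, that the rename helper cannot do the freeing because some of its `source` arguments are still in use by callers, can be sketched as follows. This is an illustration only and does not touch the filesystem: `demo_rename()` and `demo_stage_files()` are hypothetical stand-ins, the "pack-demo" base name is made up, and the helper merely logs what it would rename.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for rename_tmp_packfile(). It only reads
 * `source`; freeing it here would be wrong for callers that still
 * hold on to the string. It appends "<source> -> pack-demo.<ext>\n".
 */
void demo_rename(char *log, size_t logsz, const char *source, const char *ext)
{
	size_t used = strlen(log);

	snprintf(log + used, logsz - used, "%s -> pack-demo.%s\n", source, ext);
}

/*
 * `pack_tmp_name` is borrowed from the caller, who may keep using it.
 * `rev_tmp_name` is allocated here and never escapes, so it is freed
 * here, after the last use.
 */
void demo_stage_files(const char *pack_tmp_name, char *log, size_t logsz)
{
	char *rev_tmp_name = malloc(sizeof("tmp_rev_demo"));

	assert(pack_tmp_name && logsz > 0);
	if (!rev_tmp_name)
		abort();
	strcpy(rev_tmp_name, "tmp_rev_demo");

	log[0] = '\0';
	demo_rename(log, logsz, pack_tmp_name, "pack");
	demo_rename(log, logsz, rev_tmp_name, "rev");

	free(rev_tmp_name); /* owned here, so released here */
}
```

Keeping the free() next to the allocation's owner means the helper can stay agnostic about who owns each string it is handed.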