diff mbox series

[01/11] midx.c: clean up chunkfile after reading the MIDX

Message ID 30f6f23daf49814f479865eea5f9ee68de209d5f.1634787555.git.me@ttaylorr.com (mailing list archive)
State Superseded
Headers show
Series midx: clean up t5319 under 'SANITIZE=leak' | expand

Commit Message

Taylor Blau Oct. 21, 2021, 3:39 a.m. UTC
In order to read the contents of a MIDX, we initialize a chunkfile
structure which can read the table of contents and assign pointers into
different sections of the file for us.

We do call free(), since the chunkfile struct is heap allocated, but not
the more appropriate free_chunkfile(), which also frees memory that the
structure itself owns.

Call that instead to avoid leaking memory in this function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
The patch contents are from Ævar, but the message is mine. I hope that
he doesn't mind me forging his sign-off here.

 midx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--
2.33.0.96.g73915697e6

Comments

Junio C Hamano Oct. 21, 2021, 5:50 a.m. UTC | #1
Taylor Blau <me@ttaylorr.com> writes:

> The patch contents are from Ævar, but the message is mine. I hope that
> he doesn't mind me forging his sign-off here.
>
>  midx.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/midx.c b/midx.c
> index 8433086ac1..36e4754767 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -179,12 +179,13 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local
>  	trace2_data_intmax("midx", the_repository, "load/num_packs", m->num_packs);
>  	trace2_data_intmax("midx", the_repository, "load/num_objects", m->num_objects);
>
> +	free_chunkfile(cf);
>  	return m;
>
>  cleanup_fail:
>  	free(m);
>  	free(midx_name);
> -	free(cf);
> +	free_chunkfile(cf);
>  	if (midx_map)
>  		munmap(midx_map, midx_size);
>  	if (0 <= fd)

The former is not something we can mechanically locate, but the
latter we should be able to.  And indeed this is the only instance
the following experiment finds.

	$ cat >contrib/coccinelle/chunkfile.cocci <<-\EOF
	@@
	identifier f !~ "^free_chunkfile$";
	struct chunkfile *cf;
	@@
	  f(...) {<...
	- free(cf)
	+ free_chunkfile(cf)
	  ...>}
	EOF
	$ make contrib/coccinelle/chunkfile.cocci.patch

Thanks.
Ævar Arnfjörð Bjarmason Oct. 21, 2021, 11:34 a.m. UTC | #2
On Wed, Oct 20 2021, Taylor Blau wrote:

> The patch contents are from Ævar, but the message is mine. I hope that
> he doesn't mind me forging his sign-off here.

I don't mind, and (and maybe we should have an in-tree text file for
this) like I think Jeff King has noted for his inline-posted patches for
discussion you can assume my SOB for anything I post here on list
(except noted otherwise) in terms of copyright.
Junio C Hamano Oct. 21, 2021, 4:16 p.m. UTC | #3
Taylor Blau <me@ttaylorr.com> writes:

>  cleanup_fail:
>  	free(m);
>  	free(midx_name);
> -	free(cf);
> +	free_chunkfile(cf);
>  	if (midx_map)
>  		munmap(midx_map, midx_size);
>  	if (0 <= fd)

Not a fault of this patch, but I think the code is calling close()
on an already closed file descriptor in cleanup_fail codepath, when
"goto cleanup_fail" is reached after xmmap() returned, e.g. when
oid_version() does not match hash_version, when we failed to read
the ToC.

Also, it is not clear why is it a dying offence if we do not find
packnames chunk, but it is just a "pretend as if we do not have this
midx file and everybody else is happy" when we failed to read the
ToC.
Taylor Blau Oct. 22, 2021, 3:04 a.m. UTC | #4
On Thu, Oct 21, 2021 at 09:16:27AM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> >  cleanup_fail:
> >  	free(m);
> >  	free(midx_name);
> > -	free(cf);
> > +	free_chunkfile(cf);
> >  	if (midx_map)
> >  		munmap(midx_map, midx_size);
> >  	if (0 <= fd)
>
> Not a fault of this patch, but I think the code is calling close()
> on an already closed file descriptor in cleanup_fail codepath, when
> "goto cleanup_fail" is reached after xmmap() returned, e.g. when
> oid_version() does not match hash_version, when we failed to read
> the ToC.
>
> Also, it is not clear why is it a dying offence if we do not find
> packnames chunk, but it is just a "pretend as if we do not have this
> midx file and everybody else is happy" when we failed to read the
> ToC.

Yep, I agree with you on both of those. I would be happier to see fewer
die()s deep within midx.c. I'll start a list for myself of some
potential future cleanups in this area that don't involve memory leaks.

My hunch is that there's enough subtlty here that we shouldn't tie the
two (fixing leaks, and other general issues/tidiness in the MIDX code)
together. So I'll keep track of the latter separately and address those
in future series.

Thanks,
Taylor
diff mbox series

Patch

diff --git a/midx.c b/midx.c
index 8433086ac1..36e4754767 100644
--- a/midx.c
+++ b/midx.c
@@ -179,12 +179,13 @@  struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local
 	trace2_data_intmax("midx", the_repository, "load/num_packs", m->num_packs);
 	trace2_data_intmax("midx", the_repository, "load/num_objects", m->num_objects);

+	free_chunkfile(cf);
 	return m;

 cleanup_fail:
 	free(m);
 	free(midx_name);
-	free(cf);
+	free_chunkfile(cf);
 	if (midx_map)
 		munmap(midx_map, midx_size);
 	if (0 <= fd)