diff mbox series

repository: prevent memory leak when releasing ref stores

Message ID pull.1758.git.git.1722855364436.gitgitgadget@gmail.com (mailing list archive)
State New
Headers show
Series repository: prevent memory leak when releasing ref stores | expand

Commit Message

Sven Strickroth Aug. 5, 2024, 10:56 a.m. UTC
From: Sven Strickroth <email@cs-ware.de>

`ref_store_release` does not free the ref_store allocated in
`ref_store_init`.

Signed-off-by: Sven Strickroth <email@cs-ware.de>
---
    repository: prevent memory leak when releasing ref stores

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1758%2Fcsware%2Frepository-memory-leak-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1758/csware/repository-memory-leak-v1
Pull-Request: https://github.com/git/git/pull/1758

 repository.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)


base-commit: e559c4bf1a306cf5814418d318cc0fea070da3c7

Comments

Sven Strickroth Aug. 5, 2024, 3:50 p.m. UTC | #1
Am 05.08.2024 um 12:56 schrieb Sven Strickroth via GitGitGadget:
> -	strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e)
> +	strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e) {
>   		ref_store_release(e->value);
> +		free(e->value);
> +	}
>   	strmap_clear(&repo->submodule_ref_stores, 1);

After further checking this does not seem to be necessary. The ref 
stores are already free'd in strmap_clear.
Junio C Hamano Aug. 5, 2024, 4:28 p.m. UTC | #2
"Sven Strickroth via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Sven Strickroth <email@cs-ware.de>
>
> `ref_store_release` does not free the ref_store allocated in
> `ref_store_init`.
>
> Signed-off-by: Sven Strickroth <email@cs-ware.de>
> ---

This may certainly plug the two leaking callers, but stepping back a
bit and looking at other existing calls to ref_store_release(), I
wonder if many existing and more importantly future callers benefit
if ref_store_release() did the freeing of the surrounding shell, as
we can see in these existing calls:

refs.c:2851:	ref_store_release(new_refs);
refs.c-2852-	FREE_AND_NULL(new_refs);

refs.c:2890:	ref_store_release(old_refs);
refs.c-2891-	FREE_AND_NULL(old_refs);

refs.c:2904:		ref_store_release(new_refs);
refs.c-2905-		free(new_refs);

If we change the type of ref_store_release() to take a pointer to a
pointer to ref_store, so that the above callers can just become

	ref_store_release(&new_refs);

to release the resources and new_refs variable cleared, the
callsites in this patch can do the same.

However, I am fuzzy on the existing uses in the backend
implementation.  For example:

        static void files_ref_store_release(struct ref_store *ref_store)
        {
                struct files_ref_store *refs = files_downcast(ref_store, 0, "release");
                free_ref_cache(refs->loose);
                free(refs->gitcommondir);
                ref_store_release(refs->packed_ref_store);
        }

The packed-ref-store is "released" here, as part of "releasing" the
files-ref-store that uses it as a fallback backend.  The caller of
files_ref_store_release() is refs.c:ref_store_release()

        void ref_store_release(struct ref_store *ref_store)
        {
                ref_store->be->release(ref_store);
                free(ref_store->gitdir);
        }

So if you have a files based ref store, when you are done you'd be
calling ref_store_release() on it, releasing the resources held by
the files_ref_store structure, but I do not know who frees the
packed_ref_store allocated by files_ref_store_init().  Perhaps it is
already leaking?  If that is the case then an API update like I
suggested above would make even more sense to make it less likely
for such a leak to be added to the system in the future, I suspect.

I dunno.

Thanks.

>     repository: prevent memory leak when releasing ref stores
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1758%2Fcsware%2Frepository-memory-leak-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1758/csware/repository-memory-leak-v1
> Pull-Request: https://github.com/git/git/pull/1758
>
>  repository.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/repository.c b/repository.c
> index 9825a308993..46f1eadfe95 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -366,12 +366,16 @@ void repo_clear(struct repository *repo)
>  		FREE_AND_NULL(repo->remote_state);
>  	}
>  
> -	strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e)
> +	strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e) {
>  		ref_store_release(e->value);
> +		free(e->value);
> +	}
>  	strmap_clear(&repo->submodule_ref_stores, 1);
>  
> -	strmap_for_each_entry(&repo->worktree_ref_stores, &iter, e)
> +	strmap_for_each_entry(&repo->worktree_ref_stores, &iter, e) {
>  		ref_store_release(e->value);
> +		free(e->value);
> +	}
>  	strmap_clear(&repo->worktree_ref_stores, 1);
>  
>  	repo_clear_path_cache(&repo->cached_paths);
>
> base-commit: e559c4bf1a306cf5814418d318cc0fea070da3c7
Junio C Hamano Aug. 5, 2024, 5:24 p.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> However, I am fuzzy on the existing uses in the backend
> implementation.  For example:
>
>         static void files_ref_store_release(struct ref_store *ref_store)
>         {
>                 struct files_ref_store *refs = files_downcast(ref_store, 0, "release");
>                 free_ref_cache(refs->loose);
>                 free(refs->gitcommondir);
>                 ref_store_release(refs->packed_ref_store);
>         }
>
> The packed-ref-store is "released" here, as part of "releasing" the
> files-ref-store that uses it as a fallback backend.  The caller of
> files_ref_store_release() is refs.c:ref_store_release()
>
>         void ref_store_release(struct ref_store *ref_store)
>         {
>                 ref_store->be->release(ref_store);
>                 free(ref_store->gitdir);
>         }
>
> So if you have a files based ref store, when you are done you'd be
> calling ref_store_release() on it, releasing the resources held by
> the files_ref_store structure, but I do not know who frees the
> packed_ref_store allocated by files_ref_store_init().  Perhaps it is
> already leaking?  If that is the case then an API update like I
> suggested above would make even more sense to make it less likely
> for such a leak to be added to the system in the future, I suspect.

Ahh, that was the leak that you plugged in a separate patch.

So it does point us in the other direction to redefine _release with
a different behaviour that releases the resource held by the
structure, and frees the structure itself.

Something along the following line (caution: totally untested) that
allows your two patches to become empty, and also allows a few
callers to lose their existing explicit free()s immediately after
they call _release(), perhaps?

If this were to become a real patch, I think debug backend should
learn to use the same _downcast() to become more like the real ones
before it happens in a preliminary clean-up patch.

 refs.h                  |  5 +++--
 refs.c                  | 19 ++++++++-----------
 refs/refs-internal.h    |  2 +-
 refs/files-backend.c    |  6 +++---
 refs/packed-backend.c   |  5 +++--
 refs/reftable-backend.c |  6 +++---
 refs/debug.c            |  6 +++---
 7 files changed, 24 insertions(+), 25 deletions(-)

diff --git c/refs.h w/refs.h
index b3e39bc257..e4f092f6ac 100644
--- c/refs.h
+++ w/refs.h
@@ -119,9 +119,10 @@ int is_branch(const char *refname);
 int ref_store_create_on_disk(struct ref_store *refs, int flags, struct strbuf *err);
 
 /*
- * Release all memory and resources associated with the ref store.
+ * Release all memory and resources associated with the ref store, including
+ * the ref_store itself.
  */
-void ref_store_release(struct ref_store *ref_store);
+void ref_store_release(struct ref_store **ref_store);
 
 /*
  * Remove the ref store from disk. This deletes all associated data.
diff --git c/refs.c w/refs.c
index 915aeb4d1d..cb76a5d4bd 100644
--- c/refs.c
+++ w/refs.c
@@ -1936,10 +1936,11 @@ static struct ref_store *ref_store_init(struct repository *repo,
 	return refs;
 }
 
-void ref_store_release(struct ref_store *ref_store)
+void ref_store_release(struct ref_store **ref_store)
 {
-	ref_store->be->release(ref_store);
-	free(ref_store->gitdir);
+	(*ref_store)->be->release(ref_store);
+	free((*ref_store)->gitdir);
+	FREE_AND_NULL(*ref_store);
 }
 
 struct ref_store *get_main_ref_store(struct repository *r)
@@ -2848,8 +2849,7 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 	 * be closed. This is required for platforms like Cygwin, where
 	 * renaming an open file results in EPERM.
 	 */
-	ref_store_release(new_refs);
-	FREE_AND_NULL(new_refs);
+	ref_store_release(&new_refs);
 
 	/*
 	 * Until now we were in the non-destructive phase, where we only
@@ -2887,8 +2887,7 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 	 * make sure to lazily re-initialize the repository's ref store with
 	 * the new format.
 	 */
-	ref_store_release(old_refs);
-	FREE_AND_NULL(old_refs);
+	ref_store_release(&old_refs);
 	repo->refs_private = NULL;
 
 	ret = 0;
@@ -2900,10 +2899,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 			    new_gitdir.buf);
 	}
 
-	if (new_refs) {
-		ref_store_release(new_refs);
-		free(new_refs);
-	}
+	if (new_refs)
+		ref_store_release(&new_refs);
 	ref_transaction_free(transaction);
 	strbuf_release(&new_gitdir);
 	return ret;
diff --git c/refs/refs-internal.h w/refs/refs-internal.h
index fa975d69aa..2ba2372acb 100644
--- c/refs/refs-internal.h
+++ w/refs/refs-internal.h
@@ -511,7 +511,7 @@ typedef struct ref_store *ref_store_init_fn(struct repository *repo,
 /*
  * Release all memory and resources associated with the ref store.
  */
-typedef void ref_store_release_fn(struct ref_store *refs);
+typedef void ref_store_release_fn(struct ref_store **refs);
 
 typedef int ref_store_create_on_disk_fn(struct ref_store *refs,
 					int flags,
diff --git c/refs/files-backend.c w/refs/files-backend.c
index aa52d9be7c..8ebb1681ac 100644
--- c/refs/files-backend.c
+++ w/refs/files-backend.c
@@ -151,12 +151,12 @@ static struct files_ref_store *files_downcast(struct ref_store *ref_store,
 	return refs;
 }
 
-static void files_ref_store_release(struct ref_store *ref_store)
+static void files_ref_store_release(struct ref_store **ref_store)
 {
-	struct files_ref_store *refs = files_downcast(ref_store, 0, "release");
+	struct files_ref_store *refs = files_downcast(*ref_store, 0, "release");
 	free_ref_cache(refs->loose);
 	free(refs->gitcommondir);
-	ref_store_release(refs->packed_ref_store);
+	ref_store_release(&refs->packed_ref_store);
 }
 
 static void files_reflog_path(struct files_ref_store *refs,
diff --git c/refs/packed-backend.c w/refs/packed-backend.c
index a0666407cd..8321a4cc17 100644
--- c/refs/packed-backend.c
+++ w/refs/packed-backend.c
@@ -260,13 +260,14 @@ static void clear_snapshot(struct packed_ref_store *refs)
 	}
 }
 
-static void packed_ref_store_release(struct ref_store *ref_store)
+static void packed_ref_store_release(struct ref_store **ref_store)
 {
-	struct packed_ref_store *refs = packed_downcast(ref_store, 0, "release");
+	struct packed_ref_store *refs = packed_downcast(*ref_store, 0, "release");
 	clear_snapshot(refs);
 	rollback_lock_file(&refs->lock);
 	delete_tempfile(&refs->tempfile);
 	free(refs->path);
+	FREE_AND_NULL(*ref_store);
 }
 
 static NORETURN void die_unterminated_line(const char *path,
diff --git c/refs/reftable-backend.c w/refs/reftable-backend.c
index fbe74c239d..0af010acfb 100644
--- c/refs/reftable-backend.c
+++ w/refs/reftable-backend.c
@@ -337,9 +337,9 @@ static struct ref_store *reftable_be_init(struct repository *repo,
 	return &refs->base;
 }
 
-static void reftable_be_release(struct ref_store *ref_store)
+static void reftable_be_release(struct ref_store **ref_store)
 {
-	struct reftable_ref_store *refs = reftable_be_downcast(ref_store, 0, "release");
+	struct reftable_ref_store *refs = reftable_be_downcast(*ref_store, 0, "release");
 	struct strmap_entry *entry;
 	struct hashmap_iter iter;
 
@@ -400,7 +400,7 @@ static int reftable_be_remove_on_disk(struct ref_store *ref_store,
 	 * required so that the "tables.list" file is not open anymore, which
 	 * would otherwise make it impossible to remove the file on Windows.
 	 */
-	reftable_be_release(ref_store);
+	reftable_be_release(&ref_store);
 
 	strbuf_addf(&sb, "%s/reftable", refs->base.gitdir);
 	if (remove_dir_recursively(&sb, 0) < 0) {
diff --git c/refs/debug.c w/refs/debug.c
index 547d9245b9..be6230045c 100644
--- c/refs/debug.c
+++ w/refs/debug.c
@@ -33,10 +33,10 @@ struct ref_store *maybe_debug_wrap_ref_store(const char *gitdir, struct ref_stor
 	return (struct ref_store *)res;
 }
 
-static void debug_release(struct ref_store *refs)
+static void debug_release(struct ref_store **refs)
 {
-	struct debug_ref_store *drefs = (struct debug_ref_store *)refs;
-	drefs->refs->be->release(drefs->refs);
+	struct debug_ref_store *drefs = *(struct debug_ref_store **)refs;
+	drefs->refs->be->release(&drefs->refs);
 	trace_printf_key(&trace_refs, "release\n");
 }
Junio C Hamano Aug. 5, 2024, 5:42 p.m. UTC | #4
Sven Strickroth <email@cs-ware.de> writes:

> Am 05.08.2024 um 12:56 schrieb Sven Strickroth via GitGitGadget:
>> -	strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e)
>> +	strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e) {
>>   		ref_store_release(e->value);
>> +		free(e->value);
>> +	}
>>   	strmap_clear(&repo->submodule_ref_stores, 1);
>
> After further checking this does not seem to be necessary. The ref
> stores are already free'd in strmap_clear.

Is it "not necessary" or "actively harmful"?  It sounds like the
latter?
Patrick Steinhardt Aug. 6, 2024, 6:20 a.m. UTC | #5
On Mon, Aug 05, 2024 at 10:24:10AM -0700, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > However, I am fuzzy on the existing uses in the backend
> > implementation.  For example:
> >
> >         static void files_ref_store_release(struct ref_store *ref_store)
> >         {
> >                 struct files_ref_store *refs = files_downcast(ref_store, 0, "release");
> >                 free_ref_cache(refs->loose);
> >                 free(refs->gitcommondir);
> >                 ref_store_release(refs->packed_ref_store);
> >         }
> >
> > The packed-ref-store is "released" here, as part of "releasing" the
> > files-ref-store that uses it as a fallback backend.  The caller of
> > files_ref_store_release() is refs.c:ref_store_release()
> >
> >         void ref_store_release(struct ref_store *ref_store)
> >         {
> >                 ref_store->be->release(ref_store);
> >                 free(ref_store->gitdir);
> >         }
> >
> > So if you have a files based ref store, when you are done you'd be
> > calling ref_store_release() on it, releasing the resources held by
> > the files_ref_store structure, but I do not know who frees the
> > packed_ref_store allocated by files_ref_store_init().  Perhaps it is
> > already leaking?  If that is the case then an API update like I
> > suggested above would make even more sense to make it less likely
> > for such a leak to be added to the system in the future, I suspect.
> 
> Ahh, that was the leak that you plugged in a separate patch.
> 
> So it does point us in the other direction to redefine _release with
> a different behaviour that releases the resource held by the
> structure, and frees the structure itself.
> 
> Something along the following line (caution: totally untested) that
> allows your two patches to become empty, and also allows a few
> callers to lose their existing explicit free()s immediately after
> they call _release(), perhaps?

I don't really know whether it's worth the churn, but if somebody wants
to pull through with this I'm game :) But: if we are going to do this,
we should rename the function to be called `ref_store_free()` instead of
`ref_store_release()` according to our recent coding style update :)

> If this were to become a real patch, I think debug backend should
> learn to use the same _downcast() to become more like the real ones
> before it happens in a preliminary clean-up patch.

That certainly wouldn't hurt, yeah.

Patrick
Junio C Hamano Aug. 6, 2024, 3:44 p.m. UTC | #6
Patrick Steinhardt <ps@pks.im> writes:

>> Something along the following line (caution: totally untested) that
>> allows your two patches to become empty, and also allows a few
>> callers to lose their existing explicit free()s immediately after
>> they call _release(), perhaps?
>
> I don't really know whether it's worth the churn, but if somebody wants
> to pull through with this I'm game :) But: if we are going to do this,
> we should rename the function to be called `ref_store_free()` instead of
> `ref_store_release()` according to our recent coding style update :)

Yes, we had "what is release, clear, and free?" discussion recently.

>> If this were to become a real patch, I think debug backend should
>> learn to use the same _downcast() to become more like the real ones
>> before it happens in a preliminary clean-up patch.
>
> That certainly wouldn't hurt, yeah.

I am not short of (other) things to do, and expect that I will not
touching this for a while, but in case somebody finds #leftoverbits
here, I'll leave a note here.  

There are "hidden" freeing that we have to adjust, if we were to
follow through this approach.  For example, those free()'s added in
the patch in the message that started the thread are introduing
double free---after the strmap_for_each_entry() loop, a
strmap_clear() callis done with free_values=1.  If we freed inside
ref_store_release(), we'd need to adjust the call to strmap_clear()
to pass free_values=0 to compensate.
diff mbox series

Patch

diff --git a/repository.c b/repository.c
index 9825a308993..46f1eadfe95 100644
--- a/repository.c
+++ b/repository.c
@@ -366,12 +366,16 @@  void repo_clear(struct repository *repo)
 		FREE_AND_NULL(repo->remote_state);
 	}
 
-	strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e)
+	strmap_for_each_entry(&repo->submodule_ref_stores, &iter, e) {
 		ref_store_release(e->value);
+		free(e->value);
+	}
 	strmap_clear(&repo->submodule_ref_stores, 1);
 
-	strmap_for_each_entry(&repo->worktree_ref_stores, &iter, e)
+	strmap_for_each_entry(&repo->worktree_ref_stores, &iter, e) {
 		ref_store_release(e->value);
+		free(e->value);
+	}
 	strmap_clear(&repo->worktree_ref_stores, 1);
 
 	repo_clear_path_cache(&repo->cached_paths);