diff mbox series

[v2,01/22] remote: plug memory leak when aliasing URLs

Message ID 2afa51f9ffedfa6dab51c9515f695ddfe0a9a4f9.1723121979.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series Memory leak fixes (pt.4) | expand

Commit Message

Patrick Steinhardt Aug. 8, 2024, 1:04 p.m. UTC
When we have a `url.*.insteadOf` configuration, then we end up aliasing
URLs when populating remotes. One place where this happens is in
`alias_all_urls()`, where we loop through all remotes and then alias
each of their URLs. The actual aliasing logic is then contained in
`alias_url()`, which returns an allocated string that contains the new
URL. This URL replaces the old URL that we have in the strvec that
contanis all remote URLs.

We replace the remote URLs via `strvec_replace()`, which does not hand
over ownership of the new string to the vector. Still, we didn't free
the aliased URL and thus have a memory leak here. Fix it by freeing the
aliased string.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 remote.c                 | 2 ++
 t/t0210-trace2-normal.sh | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

Comments

Karthik Nayak Aug. 12, 2024, 8:27 a.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> When we have a `url.*.insteadOf` configuration, then we end up aliasing
> URLs when populating remotes. One place where this happens is in
> `alias_all_urls()`, where we loop through all remotes and then alias
> each of their URLs. The actual aliasing logic is then contained in
> `alias_url()`, which returns an allocated string that contains the new
> URL. This URL replaces the old URL that we have in the strvec that
> contanis all remote URLs.
>

s/contanis/contains

[snip]
Taylor Blau Aug. 12, 2024, 2:08 p.m. UTC | #2
On Thu, Aug 08, 2024 at 03:04:33PM +0200, Patrick Steinhardt wrote:
> When we have a `url.*.insteadOf` configuration, then we end up aliasing
> URLs when populating remotes. One place where this happens is in
> `alias_all_urls()`, where we loop through all remotes and then alias
> each of their URLs. The actual aliasing logic is then contained in
> `alias_url()`, which returns an allocated string that contains the new
> URL. This URL replaces the old URL that we have in the strvec that
> contanis all remote URLs.
>
> We replace the remote URLs via `strvec_replace()`, which does not hand
> over ownership of the new string to the vector. Still, we didn't free
> the aliased URL and thus have a memory leak here. Fix it by freeing the
> aliased string.

Thanks for the detailed explanation here.

> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  remote.c                 | 2 ++
>  t/t0210-trace2-normal.sh | 2 +-
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/remote.c b/remote.c
> index f43cf5e7a4..3b898edd23 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -499,6 +499,7 @@ static void alias_all_urls(struct remote_state *remote_state)
>  			if (alias)
>  				strvec_replace(&remote_state->remotes[i]->pushurl,
>  					       j, alias);
> +			free(alias);
>  		}
>  		add_pushurl_aliases = remote_state->remotes[i]->pushurl.nr == 0;
>  		for (j = 0; j < remote_state->remotes[i]->url.nr; j++) {
> @@ -512,6 +513,7 @@ static void alias_all_urls(struct remote_state *remote_state)
>  			if (alias)
>  				strvec_replace(&remote_state->remotes[i]->url,
>  					       j, alias);
> +			free(alias);
>  		}
>  	}
>  }

These both make sense to me, since alias_url() allocates the string it
returns via xstrfmt(), so having the caller free it makes sense.

I was wondering if there was a nice way to neaten up these two call
paths that both call alias_url(), check for NULL, call strbuf_replace(),
and then free the result. But I think the result here would be pretty
awkward from my attempts at it, so I think this patch looks good as-is.

Thanks,
Taylor
Jeff King Aug. 12, 2024, 2:37 p.m. UTC | #3
On Thu, Aug 08, 2024 at 03:04:33PM +0200, Patrick Steinhardt wrote:

> When we have a `url.*.insteadOf` configuration, then we end up aliasing
> URLs when populating remotes. One place where this happens is in
> `alias_all_urls()`, where we loop through all remotes and then alias
> each of their URLs. The actual aliasing logic is then contained in
> `alias_url()`, which returns an allocated string that contains the new
> URL. This URL replaces the old URL that we have in the strvec that
> contanis all remote URLs.
> 
> We replace the remote URLs via `strvec_replace()`, which does not hand
> over ownership of the new string to the vector. Still, we didn't free
> the aliased URL and thus have a memory leak here. Fix it by freeing the
> aliased string.

Thanks, this one is my fault. When I replaced the open-coded replacement
in 8e804415fd (remote: use strvecs to store remote url/pushurl,
2024-06-14), for some reason I thought that strvec_replace() would take
ownership of the pointer. We could make a "_nodup()" variant, but it is
probably not worth the extra API complexity.

Curiously, these are the only calls for strvec_replace(). You added it
in 11ce77b5cc (strvec: add functions to replace and remove strings,
2024-05-27) but I don't see them used in any iteration of that patch
series. So yet another option is to change the semantics of
strvec_replace(), but I think that is an even worse idea. ;)

-Peff
Patrick Steinhardt Aug. 13, 2024, 6:34 a.m. UTC | #4
On Mon, Aug 12, 2024 at 10:37:10AM -0400, Jeff King wrote:
> On Thu, Aug 08, 2024 at 03:04:33PM +0200, Patrick Steinhardt wrote:
> 
> > When we have a `url.*.insteadOf` configuration, then we end up aliasing
> > URLs when populating remotes. One place where this happens is in
> > `alias_all_urls()`, where we loop through all remotes and then alias
> > each of their URLs. The actual aliasing logic is then contained in
> > `alias_url()`, which returns an allocated string that contains the new
> > URL. This URL replaces the old URL that we have in the strvec that
> > contanis all remote URLs.
> > 
> > We replace the remote URLs via `strvec_replace()`, which does not hand
> > over ownership of the new string to the vector. Still, we didn't free
> > the aliased URL and thus have a memory leak here. Fix it by freeing the
> > aliased string.
> 
> Thanks, this one is my fault. When I replaced the open-coded replacement
> in 8e804415fd (remote: use strvecs to store remote url/pushurl,
> 2024-06-14), for some reason I thought that strvec_replace() would take
> ownership of the pointer. We could make a "_nodup()" variant, but it is
> probably not worth the extra API complexity.
> 
> Curiously, these are the only calls for strvec_replace(). You added it
> in 11ce77b5cc (strvec: add functions to replace and remove strings,
> 2024-05-27) but I don't see them used in any iteration of that patch
> series. So yet another option is to change the semantics of
> strvec_replace(), but I think that is an even worse idea. ;)

Oh, interesting. I certainly wanted to use it back then, but guess that
later iterations removed those callsites again. In any case, it is being
used now, so at least it isn't dead code :)

Patrick
diff mbox series

Patch

diff --git a/remote.c b/remote.c
index f43cf5e7a4..3b898edd23 100644
--- a/remote.c
+++ b/remote.c
@@ -499,6 +499,7 @@  static void alias_all_urls(struct remote_state *remote_state)
 			if (alias)
 				strvec_replace(&remote_state->remotes[i]->pushurl,
 					       j, alias);
+			free(alias);
 		}
 		add_pushurl_aliases = remote_state->remotes[i]->pushurl.nr == 0;
 		for (j = 0; j < remote_state->remotes[i]->url.nr; j++) {
@@ -512,6 +513,7 @@  static void alias_all_urls(struct remote_state *remote_state)
 			if (alias)
 				strvec_replace(&remote_state->remotes[i]->url,
 					       j, alias);
+			free(alias);
 		}
 	}
 }
diff --git a/t/t0210-trace2-normal.sh b/t/t0210-trace2-normal.sh
index c312657a12..b9adc94aab 100755
--- a/t/t0210-trace2-normal.sh
+++ b/t/t0210-trace2-normal.sh
@@ -2,7 +2,7 @@ 
 
 test_description='test trace2 facility (normal target)'
 
-TEST_PASSES_SANITIZE_LEAK=false
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Turn off any inherited trace2 settings for this test.