diff mbox series

[10/11] relative_url(): fix incorrect condition

Message ID 0bf70e65d2c9e187203a77088ff0f7d18510caca.1655336146.git.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit c918f5c1ab0c4dec916b747916236ca0d3655be5
Headers show
Series Coverity fixes | expand

Commit Message

Johannes Schindelin June 15, 2022, 11:35 p.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

In 63e95beb085c (submodule: port resolve_relative_url from shell to C,
2016-04-15), we added a loop over `url` where we are looking for `../`
or `./` components.

The loop condition we used is the pointer `url` itself, which is clearly
not what we wanted.

Pointed out by Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 remote.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Junio C Hamano June 16, 2022, 5:02 a.m. UTC | #1
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> In 63e95beb085c (submodule: port resolve_relative_url from shell to C,
> 2016-04-15), we added a loop over `url` where we are looking for `../`
> or `./` components.
>
> The loop condition we used is the pointer `url` itself, which is clearly
> not what we wanted.

8/11, 9/11, and 10/11 are trivially correct.  Will queue.

Thanks.


>
> Pointed out by Coverity.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  remote.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/remote.c b/remote.c
> index 9b9bbfe80ec..bded6acbfe8 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -2846,7 +2846,7 @@ char *relative_url(const char *remote_url, const char *url,
>  	 * When the url starts with '../', remove that and the
>  	 * last directory in remoteurl.
>  	 */
> -	while (url) {
> +	while (*url) {
>  		if (starts_with_dot_dot_slash_native(url)) {
>  			url += 3;
>  			colonsep |= chop_last_dir(&remoteurl, is_relative);
Ævar Arnfjörð Bjarmason June 16, 2022, 1:09 p.m. UTC | #2
On Wed, Jun 15 2022, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> In 63e95beb085c (submodule: port resolve_relative_url from shell to C,
> 2016-04-15), we added a loop over `url` where we are looking for `../`
> or `./` components.
>
> The loop condition we used is the pointer `url` itself, which is clearly
> not what we wanted.

Clearly, but having looked at this I think it's useful to note that
coverity must be (I don't know what it actually emitted) be complaining
about the "url" condition being useless, i.e. it's the same as a "while
(1)", not that we run off the end of the string.

I.e. due to the way those start_with*() functions work we would abort
early if we were running off th end of the string.

> Pointed out by Coverity.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  remote.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/remote.c b/remote.c
> index 9b9bbfe80ec..bded6acbfe8 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -2846,7 +2846,7 @@ char *relative_url(const char *remote_url, const char *url,
>  	 * When the url starts with '../', remove that and the
>  	 * last directory in remoteurl.
>  	 */
> -	while (url) {
> +	while (*url) {
>  		if (starts_with_dot_dot_slash_native(url)) {
>  			url += 3;
>  			colonsep |= chop_last_dir(&remoteurl, is_relative);

Which I tested with this:
	
	diff --git a/remote.c b/remote.c
	index 9b9bbfe80ec..e049bbb791c 100644
	--- a/remote.c
	+++ b/remote.c
	@@ -2846,14 +2846,17 @@ char *relative_url(const char *remote_url, const char *url,
	 	 * When the url starts with '../', remove that and the
	 	 * last directory in remoteurl.
	 	 */
	-	while (url) {
	+	while (1) {
	 		if (starts_with_dot_dot_slash_native(url)) {
	 			url += 3;
	 			colonsep |= chop_last_dir(&remoteurl, is_relative);
	-		} else if (starts_with_dot_slash_native(url))
	+		} else if (starts_with_dot_slash_native(url)) {
	 			url += 2;
	-		else
	-			break;
	+		} else if (!*url) {
	+			BUG("ran off the end of our url?");
	+		} else {
	+			break; 
	+		}
	 	}
	 	strbuf_reset(&sb);
	 	strbuf_addf(&sb, "%s%s%s", remoteurl, colonsep ? ":" : "/", url);
	
Which will fail e.g. on this test in t3420-rebase-autostash.sh:

	+ git submodule add ./ sub
	BUG: remote.c:2856: ran off the end of our url?
	Aborted

I worried a bit about this since we released this with v2.9.0, so in all
this time we haven't been infinitely looping on this case, or at least
haven't had any reports about that.

So if we hadn't been catching this in starts_with_*() I wouldn't be
confident that this was the correct fix, yes it's almost certainly not
what not what was intended, but if we change it to that are we confident
that the side-effects are going to be what we want?

But in this case I'm pretty sure that the behavior before/after will be
the same, and that the only change will be to make coverity happier, and
the code less confusing.
Junio C Hamano June 16, 2022, 4:41 p.m. UTC | #3
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> In 63e95beb085c (submodule: port resolve_relative_url from shell to C,
> 2016-04-15), we added a loop over `url` where we are looking for `../`
> or `./` components.
>
> The loop condition we used is the pointer `url` itself, which is clearly
> not what we wanted.
>
> Pointed out by Coverity.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  remote.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/remote.c b/remote.c
> index 9b9bbfe80ec..bded6acbfe8 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -2846,7 +2846,7 @@ char *relative_url(const char *remote_url, const char *url,
>  	 * When the url starts with '../', remove that and the
>  	 * last directory in remoteurl.
>  	 */
> -	while (url) {
> +	while (*url) {
>  		if (starts_with_dot_dot_slash_native(url)) {
>  			url += 3;
>  			colonsep |= chop_last_dir(&remoteurl, is_relative);
		} else if (starts_with_dot_slash_native(url))
			url += 2;
		else
			break;
	}

Looking at the "problematic" code again, the original is bad in the
sense that it is utterly misleading, but not quite "incorrect", I
would think.

In other words, what the original wanted to write is an infinite
loop that uses an explicit "break" to terminate the iteration, so we
should have written "while (1)" in the first place.

We know that earlier part of the function already depends on url
being not NULL, and nothing in the loop makes it NULL.  *url that is
NUL would make the two starts_with_frotz() tests fail and causes the
loop to terminate by hitting "break", so while it is correct to
check *url, it is wasting cycles.

But this patch is not wrong per-se, so I still have it in my tree
;-)

Thanks.
Junio C Hamano June 16, 2022, 5:55 p.m. UTC | #4
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> -	while (url) {
>> +	while (*url) {
>>  		if (starts_with_dot_dot_slash_native(url)) {
>>  			url += 3;
>>  			colonsep |= chop_last_dir(&remoteurl, is_relative);
>
> Which I tested with this:
> 	
> 	diff --git a/remote.c b/remote.c
> 	index 9b9bbfe80ec..e049bbb791c 100644
> 	--- a/remote.c
> 	+++ b/remote.c
> 	@@ -2846,14 +2846,17 @@ char *relative_url(const char *remote_url, const char *url,
> 	 	 * When the url starts with '../', remove that and the
> 	 	 * last directory in remoteurl.
> 	 	 */
> 	-	while (url) {
> 	+	while (1) {
> 	 		if (starts_with_dot_dot_slash_native(url)) {
> 	 			url += 3;
> 	 			colonsep |= chop_last_dir(&remoteurl, is_relative);
> 	-		} else if (starts_with_dot_slash_native(url))
> 	+		} else if (starts_with_dot_slash_native(url)) {
> 	 			url += 2;
> 	-		else
> 	-			break;
> 	+		} else if (!*url) {
> 	+			BUG("ran off the end of our url?");
> 	+		} else {
> 	+			break; 
> 	+		}
> 	 	}
> 	 	strbuf_reset(&sb);
> 	 	strbuf_addf(&sb, "%s%s%s", remoteurl, colonsep ? ":" : "/", url);
> 	
> Which will fail e.g. on this test in t3420-rebase-autostash.sh:
>
> 	+ git submodule add ./ sub
> 	BUG: remote.c:2856: ran off the end of our url?
> 	Aborted

Sorry, but I am confused.  I do not see the point of that test
before the BUG.

I agree that this loop over url is "we want to loop forever, and
stop immediately when url does not start with ../ or ./" infinite
loop, and it is better written with "while(1)".  If you do not make
any other changes that confuses me, we'll get this:

-	while (*url) {
+	while (1) {
		if (starts_with_dot_dot_slash_native(url)) {
			url += 3;
			colonsep |= chop_last_dir(&remoteurl, is_relative);
		} else if (starts_with_dot_slash_native(url))
			url += 2;
		else
			break;
	}

We know url upon entry is not NULL.  starts_with_dot_*() ends up
calling dir.c:path_match_flags() and the first thing it does is to
return if the byte is not '.'.  So the difference is whether you
leave the loop with the "while" condition, or you fail the two
starts_with_dot_*() calls and hit the "break" at the end in the
loop.

IOW, running of the end of the URL is not a BUG, is it?
diff mbox series

Patch

diff --git a/remote.c b/remote.c
index 9b9bbfe80ec..bded6acbfe8 100644
--- a/remote.c
+++ b/remote.c
@@ -2846,7 +2846,7 @@  char *relative_url(const char *remote_url, const char *url,
 	 * When the url starts with '../', remove that and the
 	 * last directory in remoteurl.
 	 */
-	while (url) {
+	while (*url) {
 		if (starts_with_dot_dot_slash_native(url)) {
 			url += 3;
 			colonsep |= chop_last_dir(&remoteurl, is_relative);