diff mbox series

http.c: clear the 'finished' member once we are done with it

Message ID xmqqczgqjr8y.fsf_-_@gitster.g (mailing list archive)
State New, archived
Headers show
Series http.c: clear the 'finished' member once we are done with it | expand

Commit Message

Junio C Hamano May 6, 2022, 9:17 p.m. UTC
In http.c, the run_active_slot() function allows the given "slot" to
make progress by calling step_active_slots() in a loop repeatedly,
and the loop is not left until the request held in the slot
completes.

Ages ago, we used to use the slot->in_use member to get out of the
loop, which misbehaved when the request in "slot" completes (at
which time, the result of the request is copied away from the slot,
and the in_use member is cleared, making the slot ready to be
reused), and the "slot" gets reused to service a different request
(at which time, the "slot" becomes in_use again, even though it is
for a different request).  The loop terminating condition mistakenly
thought that the original request has yet to be completed.

Today's code, after baa7b67d (HTTP slot reuse fixes, 2006-03-10)
fixed this issue, uses a separate "slot->finished" member that is
set in run_active_slot() to point to an on-stack variable, and the
code that completes the request in finish_active_slot() clears the
on-stack variable via the pointer to signal that the particular
request held by the slot has completed.  It also clears the in_use
member (as before that fix), so that the slot itself can safely be
reused for an unrelated request.

One thing that is not quite clean in this arrangement is that,
unless the slot gets reused, at which point the finished member is
reset to NULL, the member keeps the value of &finished, which
becomes a dangling pointer into the stack when run_active_slot()
returns.  Clear the finished member before the control leaves the
function, but make sure to limit it to the case where the pointer
still points at the on-stack variable of ours (the pointer may be
set to point at the on-stack variable of somebody else after the
slot gets reused, in which case we do not want to touch it).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * So, this has been sitting in my pile of random patches for a few
   weeks.

 http.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Carlo Marcelo Arenas Belón May 7, 2022, 5:40 a.m. UTC | #1
On Fri, May 06, 2022 at 02:17:01PM -0700, Junio C Hamano wrote:
> diff --git a/http.c b/http.c
> index 229da4d148..85437b1980 100644
> --- a/http.c
> +++ b/http.c
> @@ -1367,6 +1367,9 @@ void run_active_slot(struct active_request_slot *slot)
>  			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
>  		}
>  	}
> +
> +	if (slot->finished == &finished)
> +		slot->finished = NULL;

I am not completely sure yet (since I looked at it long ago and got
sidetracked) but I think this might be optimized out (at least by gcc12)
since it is technically UB, which is why it never "fixed" the warning.

the "correct" way to implement this would be to make "finished" a thread
local static, which is finally one good reason to support C99, but the
syntax to do so with Windows broke my first attempt at doing so and now
I can even find the code I used then which required a per platform macro
and was better looking than the following

Carlo
--- >8 ----
Date: Wed, 20 Apr 2022 23:25:55 -0700
Subject: [PATCH] http: avoid using out of scope pointers

baa7b67d091 (HTTP slot reuse fixes, 2006-03-10) introduces a way
to notify a curl thread that its slot is finished by using a pointer
to a stack variable from run_active_slot(), but then gcc 12 was
released and started rightfully to complain about it (-Wdangling-pointer).

Use instead a thread storage static variable which is safe to use
between threads since C99 and doesn't go out of scope, while being
functionally equivalent to the original code, and also remove the
workaround from 9c539d1027d (config.mak.dev: alternative workaround
to gcc 12 warning in http.c, 2022-04-15)

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
 config.mak.dev | 1 -
 http.c         | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/config.mak.dev b/config.mak.dev
index c3104f400b..335efd4620 100644
--- a/config.mak.dev
+++ b/config.mak.dev
@@ -68,7 +68,6 @@ endif
 # https://bugzilla.redhat.com/show_bug.cgi?id=2075786
 ifneq ($(filter gcc12,$(COMPILER_FEATURES)),)
 DEVELOPER_CFLAGS += -Wno-error=stringop-overread
-DEVELOPER_CFLAGS += -Wno-error=dangling-pointer
 endif
 
 GIT_TEST_PERL_FATAL_WARNINGS = YesPlease
diff --git a/http.c b/http.c
index 229da4d148..cb9acfca19 100644
--- a/http.c
+++ b/http.c
@@ -1327,7 +1327,7 @@ void run_active_slot(struct active_request_slot *slot)
 	fd_set excfds;
 	int max_fd;
 	struct timeval select_timeout;
-	int finished = 0;
+	static __thread int finished;
 
 	slot->finished = &finished;
 	while (!finished) {
Junio C Hamano May 7, 2022, 6:42 p.m. UTC | #2
Carlo Marcelo Arenas Belón <carenas@gmail.com> writes:

> On Fri, May 06, 2022 at 02:17:01PM -0700, Junio C Hamano wrote:
>> diff --git a/http.c b/http.c
>> index 229da4d148..85437b1980 100644
>> --- a/http.c
>> +++ b/http.c
>> @@ -1367,6 +1367,9 @@ void run_active_slot(struct active_request_slot *slot)
>>  			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
>>  		}
>>  	}
>> +
>> +	if (slot->finished == &finished)
>> +		slot->finished = NULL;
>
> I am not completely sure yet (since I looked at it long ago and got
> sidetracked) but I think this might be optimized out (at least by gcc12)
> since it is technically UB, which is why it never "fixed" the warning.

UB meaning "undefined behaviour"?  Which part is?  Taking the
address of an on-stack variable "finished"?  Comparing it with a
pointer that may or may not have been assigned/overwritten elsewhere
in a structure?  Not clearing the member in the struct unconditionally?

Puzzled.
Carlo Marcelo Arenas Belón May 7, 2022, 7:11 p.m. UTC | #3
On Sat, May 7, 2022 at 11:42 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Carlo Marcelo Arenas Belón <carenas@gmail.com> writes:
>
> > On Fri, May 06, 2022 at 02:17:01PM -0700, Junio C Hamano wrote:
> >> diff --git a/http.c b/http.c
> >> index 229da4d148..85437b1980 100644
> >> --- a/http.c
> >> +++ b/http.c
> >> @@ -1367,6 +1367,9 @@ void run_active_slot(struct active_request_slot *slot)
> >>                      select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
> >>              }
> >>      }
> >> +
> >> +    if (slot->finished == &finished)
> >> +            slot->finished = NULL;
> >
> > I am not completely sure yet (since I looked at it long ago and got
> > sidetracked) but I think this might be optimized out (at least by gcc12)
> > since it is technically UB, which is why it never "fixed" the warning.
>
> UB meaning "undefined behaviour"?  Which part is?  Taking the
> address of an on-stack variable "finished"?
> Comparing it with a
> pointer that may or may not have been assigned/overwritten elsewhere
> in a structure?

it is not very intuitive, but using a pointer to a variable that is
out of scope is UB, and in this case the value of slot->finished might
point to an address that is not in our own stack (because it came from
a different thread), hence undefined

Carlo
Johannes Schindelin May 23, 2022, 9:58 p.m. UTC | #4
Hi Junio,

On Fri, 6 May 2022, Junio C Hamano wrote:

> In http.c, the run_active_slot() function allows the given "slot" to
> make progress by calling step_active_slots() in a loop repeatedly,
> and the loop is not left until the request held in the slot
> completes.
>
> Ages ago, we used to use the slot->in_use member to get out of the
> loop, which misbehaved when the request in "slot" completes (at
> which time, the result of the request is copied away from the slot,
> and the in_use member is cleared, making the slot ready to be
> reused), and the "slot" gets reused to service a different request
> (at which time, the "slot" becomes in_use again, even though it is
> for a different request).  The loop terminating condition mistakenly
> thought that the original request has yet to be completed.
>
> Today's code, after baa7b67d (HTTP slot reuse fixes, 2006-03-10)
> fixed this issue, uses a separate "slot->finished" member that is
> set in run_active_slot() to point to an on-stack variable, and the
> code that completes the request in finish_active_slot() clears the
> on-stack variable via the pointer to signal that the particular
> request held by the slot has completed.  It also clears the in_use
> member (as before that fix), so that the slot itself can safely be
> reused for an unrelated request.
>
> One thing that is not quite clean in this arrangement is that,
> unless the slot gets reused, at which point the finished member is
> reset to NULL, the member keeps the value of &finished, which
> becomes a dangling pointer into the stack when run_active_slot()
> returns.  Clear the finished member before the control leaves the
> function, but make sure to limit it to the case where the pointer
> still points at the on-stack variable of ours (the pointer may be
> set to point at the on-stack variable of somebody else after the
> slot gets reused, in which case we do not want to touch it).
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>
>  * So, this has been sitting in my pile of random patches for a few
>    weeks.

I stumbled over the need for this while investigating the build failures
caused by upgrading Git for Windows' SDK's GCC to v12.x.

> diff --git a/http.c b/http.c
> index 229da4d148..85437b1980 100644
> --- a/http.c
> +++ b/http.c
> @@ -1367,6 +1367,9 @@ void run_active_slot(struct active_request_slot *slot)
>  			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
>  		}
>  	}
> +
> +	if (slot->finished == &finished)
> +		slot->finished = NULL;

First of all, I suspect that
https://github.com/git/git/blob/v2.36.1/http.c#L1207 makes sure that GCC's
complaint is not actually accurate: we always re-set `finished` to `NULL`
when getting an unused slot, so even if there is a left-over dangling
pointer, it is not actually used, ever.

But we need something to pacify GCC. Let's look at your patch.

The first thing to note is that this is not _quite_ thread-safe: between
checking the condition `slot->finished == &finished` and assigning
`slot->finished`, another thread could potentially have noticed that the
slot is not in use and overwritten the `finished` attribute, which would
then be set to `NULL` in this thread, in which case _that other_ thread's
`while (!finished)` loop would become an infinite loop.

Having said that, the time window is really narrow.

Besides, I suspect that we _already_ have an equivalent "offender" in
https://github.com/git/git/blob/v2.36.1/http.c#L1336: we look at `in_use`
there, assuming that it is either `1` if "our" request is still active, and
otherwise it is `0`. However, it might have turned to `0` _and_ to `1`
again in the meantime (but the `in_use` would now refer to _another_
request).

I am not quite sure how correct my reading of the situation is, so please
double-check my analysis.

If that analysis is correct, I would expect the correct solution to turn
`finished` into an attribute of the slot, and change its role to be a flag
that this slot is spoken for and cannot be re-used quite yet even if it is
not currently in use.

Something like this:

-- snip --
diff --git a/http-walker.c b/http-walker.c
index 910fae539b89..5cc369dea853 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -225,13 +225,9 @@ static void process_alternates_response(void *callback_data)
 					 alt_req->url->buf);
 			active_requests++;
 			slot->in_use = 1;
-			if (slot->finished != NULL)
-				(*slot->finished) = 0;
 			if (!start_active_slot(slot)) {
 				cdata->got_alternates = -1;
 				slot->in_use = 0;
-				if (slot->finished != NULL)
-					(*slot->finished) = 1;
 			}
 			return;
 		}
diff --git a/http.c b/http.c
index b08795715f8a..2d125132fb90 100644
--- a/http.c
+++ b/http.c
@@ -205,8 +205,7 @@ static void finish_active_slot(struct active_request_slot *slot)
 	closedown_active_slot(slot);
 	curl_easy_getinfo(slot->curl, CURLINFO_HTTP_CODE, &slot->http_code);

-	if (slot->finished != NULL)
-		(*slot->finished) = 1;
+	slot->in_use = 0;

 	/* Store slot results so they can be read after the slot is reused */
 	if (slot->results != NULL) {
@@ -1212,13 +1211,14 @@ struct active_request_slot *get_active_slot(void)
 			process_curl_messages();
 	}

-	while (slot != NULL && slot->in_use)
+	while (slot != NULL && (slot->in_use || slot->reserved_for_use))
 		slot = slot->next;

 	if (slot == NULL) {
 		newslot = xmalloc(sizeof(*newslot));
 		newslot->curl = NULL;
 		newslot->in_use = 0;
+		newslot->reserved_for_use = 0;
 		newslot->next = NULL;

 		slot = active_queue_head;
@@ -1240,7 +1240,6 @@ struct active_request_slot *get_active_slot(void)
 	active_requests++;
 	slot->in_use = 1;
 	slot->results = NULL;
-	slot->finished = NULL;
 	slot->callback_data = NULL;
 	slot->callback_func = NULL;
 	curl_easy_setopt(slot->curl, CURLOPT_COOKIEFILE, curl_cookie_file);
@@ -1332,7 +1331,7 @@ void fill_active_slots(void)
 	}

 	while (slot != NULL) {
-		if (!slot->in_use && slot->curl != NULL
+		if (!slot->in_use && !slot->reserved_for_use && slot->curl
 			&& curl_session_count > min_curl_sessions) {
 			curl_easy_cleanup(slot->curl);
 			slot->curl = NULL;
@@ -1363,10 +1362,9 @@ void run_active_slot(struct active_request_slot *slot)
 	fd_set excfds;
 	int max_fd;
 	struct timeval select_timeout;
-	int finished = 0;

-	slot->finished = &finished;
-	while (!finished) {
+	slot->reserved_for_use = 1;
+	while (slot->in_use) {
 		step_active_slots();

 		if (slot->in_use) {
@@ -1403,6 +1401,7 @@ void run_active_slot(struct active_request_slot *slot)
 			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
 		}
 	}
+	slot->reserved_for_use = 0;
 }

 static void release_active_slot(struct active_request_slot *slot)
diff --git a/http.h b/http.h
index df1590e53a45..3b2f6da570cd 100644
--- a/http.h
+++ b/http.h
@@ -22,9 +22,9 @@ struct slot_results {
 struct active_request_slot {
 	CURL *curl;
 	int in_use;
+	int reserved_for_use;
 	CURLcode curl_result;
 	long http_code;
-	int *finished;
 	struct slot_results *results;
 	void *callback_data;
 	void (*callback_func)(void *data);
-- snap --

I integrated this into a local branch that fixes the build with GCC v12.x
(required so that our CI/PR builds work again after Git for Windows' SDK
upgraded its GCC) and plan on contributing these patches in a bit.

Ciao,
Dscho

>  }
>
>  static void release_active_slot(struct active_request_slot *slot)
> --
> 2.36.1-200-gf89ea983ca
>
>
>
Junio C Hamano May 23, 2022, 10:58 p.m. UTC | #5
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> I stumbled over the need for this while investigating the build failures
> caused by upgrading Git for Windows' SDK's GCC to v12.x.
>
>> diff --git a/http.c b/http.c
>> index 229da4d148..85437b1980 100644
>> --- a/http.c
>> +++ b/http.c
>> @@ -1367,6 +1367,9 @@ void run_active_slot(struct active_request_slot *slot)
>>  			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
>>  		}
>>  	}
>> +
>> +	if (slot->finished == &finished)
>> +		slot->finished = NULL;
>
> First of all, I suspect that
> https://github.com/git/git/blob/v2.36.1/http.c#L1207 makes sure that GCC's
> complaint is not actually accurate: we always re-set `finished` to `NULL`
> when getting an unused slot, so even if there is a left-over dangling
> pointer, it is not actually used, ever.
>
> But we need something to pacify GCC. Let's look at your patch.
>
> The first thing to note is that this is not _quite_ thread-safe: between

Does this part of the code ever run multi-threaded?

> If that analysis is correct, I would expect the correct solution to turn
> `finished` into an attribute of the slot, and change its role to be a flag
> that this slot is spoken for and cannot be re-used quite yet even if it is
> not currently in use.

I have a feeling that we've mentioned that at least twice (perhaps
three times) in the recent past that it is in essense reverting what
the "finished" change baa7b67d (HTTP slot reuse fixes, 2006-03-10)
did.  We used to use the in-use bit of the slot as an indicator that
the slot dispatched by run_active_slot() has finished (i.e. the
in-use bit must be cleared when the request held in the struct is
fully done), but that broke when a slot we are looking at in
run_active_slot() is serviced (which makes in_use false), and then
another request reuses the slot (now no longer in_use), before the
control comes back to the loop.  "while (slot->in_use)" at the
beginning of the loop was still true, but the original request the
slot was being used for, the one that the run_active_slot() function
cares about, has completed.

So...
Junio C Hamano May 23, 2022, 11:36 p.m. UTC | #6
Junio C Hamano <gitster@pobox.com> writes:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
>> I stumbled over the need for this while investigating the build failures
>> caused by upgrading Git for Windows' SDK's GCC to v12.x.
>>
>>> diff --git a/http.c b/http.c
>>> index 229da4d148..85437b1980 100644
>>> --- a/http.c
>>> +++ b/http.c
>>> @@ -1367,6 +1367,9 @@ void run_active_slot(struct active_request_slot *slot)
>>>  			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
>>>  		}
>>>  	}
>>> +
>>> +	if (slot->finished == &finished)
>>> +		slot->finished = NULL;
> ...
> I have a feeling that we've mentioned that at least twice (perhaps
> three times) in the recent past that it is in essense reverting what
> the "finished" change baa7b67d (HTTP slot reuse fixes, 2006-03-10)
> did.  We used to use the in-use bit of the slot as an indicator that
> the slot dispatched by run_active_slot() has finished (i.e. the
> in-use bit must be cleared when the request held in the struct is
> fully done), but that broke when a slot we are looking at in
> run_active_slot() is serviced (which makes in_use false), and then
> another request reuses the slot (now no longer in_use), before the
> control comes back to the loop.  "while (slot->in_use)" at the
> beginning of the loop was still true, but the original request the
> slot was being used for, the one that the run_active_slot() function
> cares about, has completed.

Given that the breakage we fixed in 2006 is about run_active_slot()
calling step_active_slots() repeatedly, during which this and other
requests in flight completes when curl_multi_perform() receives and
handles responses, and recursively ends up calling run_active_slot()
for _another_ request reusing the slot we are interested in in the
codepath in the above disccussion, I _think_ we do not have to
consider the case where slot->finished is pointing at somebody
else's finished variable on stack here.  Yes, while we repeatedly
call step_active_slots(), our request in the slot may complete, the
slot may be marked as unused, somebody else may reuse the slot,
marking it as in_use again and using slot->finished pointer to their
on-stack finsihed.  But that somebody else's invocation of
run_active_slot() will not give control back before their on-stack
finished indicates that their recursive call to step_active_slots()
completes their request.  So after they come back and we exit our
while() loop, either slot->finished points at our finished if slot
did not get reused, or it points at an unused part of the stack that
has long been rewound when we returned from the recursive call.  In
either case, slot->finished never points at an on-stack address of
an ongoing run_active_slot() call made by somebody else that the
guard I added (i.e. we must only clear it if it points our on-stack
"finished") was trying to protect against clobbering.

So, I guess an unconditional assignment of

	slot->finished = NULL;

there would be sufficient.
Johannes Schindelin May 23, 2022, 11:41 p.m. UTC | #7
Hi Junio,

On Mon, 23 May 2022, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > I stumbled over the need for this while investigating the build failures
> > caused by upgrading Git for Windows' SDK's GCC to v12.x.
> >
> >> diff --git a/http.c b/http.c
> >> index 229da4d148..85437b1980 100644
> >> --- a/http.c
> >> +++ b/http.c
> >> @@ -1367,6 +1367,9 @@ void run_active_slot(struct active_request_slot *slot)
> >>  			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
> >>  		}
> >>  	}
> >> +
> >> +	if (slot->finished == &finished)
> >> +		slot->finished = NULL;
> >
> > First of all, I suspect that
> > https://github.com/git/git/blob/v2.36.1/http.c#L1207 makes sure that GCC's
> > complaint is not actually accurate: we always re-set `finished` to `NULL`
> > when getting an unused slot, so even if there is a left-over dangling
> > pointer, it is not actually used, ever.
> >
> > But we need something to pacify GCC. Let's look at your patch.
> >
> > The first thing to note is that this is not _quite_ thread-safe: between
>
> Does this part of the code ever run multi-threaded?

It calls into cURL, which I suspect has a multi-threaded mode of
operation, and we do use some callbacks in `http-walker.c` and I was
worried that these callbacks could be called via cURL calls. That's why I
am concerned about thread-safety. I have looked for a while and not found
any way that code in `http.c` or `http-walker.c` could set `finished` by
way of a cURL callback, but there is a lot of code there and I could have
missed something crucial.

> > If that analysis is correct, I would expect the correct solution to turn
> > `finished` into an attribute of the slot, and change its role to be a flag
> > that this slot is spoken for and cannot be re-used quite yet even if it is
> > not currently in use.
>
> I have a feeling that we've mentioned that at least twice (perhaps
> three times) in the recent past that it is in essense reverting what
> the "finished" change baa7b67d (HTTP slot reuse fixes, 2006-03-10)
> did.  We used to use the in-use bit of the slot as an indicator that
> the slot dispatched by run_active_slot() has finished (i.e. the
> in-use bit must be cleared when the request held in the struct is
> fully done), but that broke when a slot we are looking at in
> run_active_slot() is serviced (which makes in_use false), and then
> another request reuses the slot (now no longer in_use), before the
> control comes back to the loop.  "while (slot->in_use)" at the
> beginning of the loop was still true, but the original request the
> slot was being used for, the one that the run_active_slot() function
> cares about, has completed.
>
> So...

No, I suggested to replace the `finished` variable with an attribute (or
"field" or "member variable") of the slot, and to respect it when looking
for an unused slot, i.e. not only look for a slot whose `in_use` is 0 but
also require `reserved_for_use` to be 0. In essence, the
`run_active_slot()` function owns the slot, even if it is not marked as
`in_use`. That should address the same concern as baa7b67d but without
using a pointer to a local variable.

Ciao,
Dscho
Junio C Hamano May 24, 2022, 12:02 a.m. UTC | #8
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> It calls into cURL, which I suspect has a multi-threaded mode of
> operation,

https://curl.se/libcurl/c/threadsafe.html ;-)

My understanding is that what we have is pretty much select() driven
single-threaded multi-fd transfer.

> No, I suggested to replace the `finished` variable with an attribute (or
> "field" or "member variable") of the slot, and to respect it when looking
> for an unused slot, i.e. not only look for a slot whose `in_use` is 0 but
> also require `reserved_for_use` to be 0. In essence, the
> `run_active_slot()` function owns the slot, even if it is not marked as
> `in_use`. That should address the same concern as baa7b67d but without
> using a pointer to a local variable.

Not really.  An outer run_active_slot() and an inner
run_active_slot() have a pointer to the same slot object.

The inner one got hold of that object because the request the slot
used to represent for the outer run_active_slot() has finished, so
we would toggle either *(slot->finished) or the new slot->done in an
attempt to signal the completion to the outer run_active_slot() and
then make the slot not-in-use.  The slot becomes in-use again with a
different request and the inner run_active_slot() is run.  It first
says "this slot is not done yet---we are making a request using
it".  How would the inner one say that, exactly?

In baa7b67d's fix, it is done by setting slot->finished = &finished
to its own stackframe.  Because the outer run_active_slot() does not
look at slot->finished, but it looks at the finished on its
stackframe, what the inner run_active_slot() does here would not
break the outer one.

If we replace the mechanism with a separate member in the slot
structure, so that the outer run_active_slot() looks at slot->done
and the inner run_active_slot() also clears slot->done before
proceeding, then the inner one clobbers what the outer one will look
at when the recursive call that led to the inner one returns.
Daniel Stenberg May 24, 2022, 6:31 a.m. UTC | #9
On Mon, 23 May 2022, Junio C Hamano wrote:

>> It calls into cURL, which I suspect has a multi-threaded mode of
>> operation,
>
> https://curl.se/libcurl/c/threadsafe.html ;-)
>
> My understanding is that what we have is pretty much select() driven 
> single-threaded multi-fd transfer.

Confirmed. libcurl *can* use threads (if built that way), but the only use it 
has for such subthreads is for resolving host names. libcurl, its API and its 
callbacks etc always operate in the same single thread.
Johannes Schindelin May 24, 2022, 10:57 a.m. UTC | #10
Hi Daniel,

On Tue, 24 May 2022, Daniel Stenberg wrote:

> On Mon, 23 May 2022, Junio C Hamano wrote:
>
> > > It calls into cURL, which I suspect has a multi-threaded mode of
> > > operation,
> >
> > https://curl.se/libcurl/c/threadsafe.html ;-)
> >
> > My understanding is that what we have is pretty much select() driven
> > single-threaded multi-fd transfer.
>
> Confirmed. libcurl *can* use threads (if built that way), but the only use it
> has for such subthreads is for resolving host names. libcurl, its API and its
> callbacks etc always operate in the same single thread.

Great, thanks for clarifying!

Ciao,
Dscho

P.S.: I'm enjoying your book, Uncurled. It feels great to see validation
in some experiences (I'm a youngster compared to you, maintaining Open
Source software only since 2001 or so, but still), and to get a fresh and
inspiring perspective on others.
Johannes Schindelin May 24, 2022, 11:03 a.m. UTC | #11
Hi Junio,

On Mon, 23 May 2022, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > I suggested to replace the `finished` variable with an attribute (or
> > "field" or "member variable") of the slot, and to respect it when
> > looking for an unused slot, i.e. not only look for a slot whose
> > `in_use` is 0 but also require `reserved_for_use` to be 0. In essence,
> > the `run_active_slot()` function owns the slot, even if it is not
> > marked as `in_use`. That should address the same concern as baa7b67d
> > but without using a pointer to a local variable.
>
> Not really.  An outer run_active_slot() and an inner
> run_active_slot() have a pointer to the same slot object.

How is that possible? One of the first things that function does is to
assign `slot->finished = &finished`, and then run that `while (!finished)`
loop.

How would the outer `run_active_slot()` ever get signaled via `finished`
when the inner `run_active_slot()` would overwrite `slot->finished`? I am
puzzled why we do not see infinite loops in such outer calls all the time,
then.

Ciao,
Dscho
Junio C Hamano May 24, 2022, 5:15 p.m. UTC | #12
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> Not really.  An outer run_active_slot() and an inner
>> run_active_slot() have a pointer to the same slot object.
>
> How is that possible? One of the first things that function does is to
> assign `slot->finished = &finished`, and then run that `while (!finished)`
> loop.
>
> How would the outer `run_active_slot()` ever get signaled via `finished`
> when the inner `run_active_slot()` would overwrite `slot->finished`? I am
> puzzled why we do not see infinite loops in such outer calls all the time,
> then.

The idea in http subsystem goes like this.

 * Generally, we have multiple curl requests in flight.  A curlm
   passed to curl_multi_perform() call knows about them and attempts
   to make as much progress without blocking.

 * After calling curl_multi_perform(), we call process_curl_messages()
   to collect the response that corresponds to the request.  This is
   done using the slot data structure.  Once we read the response,
   we may process it further by making a callback.

 * A slot, when finished, can be reused.  THe reuse is controlled by
   its in_use member.

So, let's trace a code flow, http-walker.c::fetch_object() is used
as a sample starting point.

 * http-walker.c::fetch_object()
   - pushes the object name to object_request queue.
   - calls step_active_slots() to make progress.  This function in turn
     - calls curl_multi_perform() repeatedly to make progress
     - calls process_curl_messages() to possibly complete some active slots
     - calls fill_active_slots() to fill more requests.  This function
       calls the "fill" function repeatedly to make more requests,
       which is http-walker.c::fill_active_slot() in this code path.  It
       - repeatedly calls start_object_request()
         * start_object_request() does these:
           - calls new_http_object_request(), which prepares object-request
             structure, in which there is a slot member that was
	     obtained by calling get_active_slot().
	     * get_active_slot() does many things, but all we need to know	
	       here is that it does "in_use = 1".
           - sets callback for the slot to process_object_response()
           - calls start_active_slot(),
             which adds the slot to curlm and calls curl_multi_perform()
             to make progress on the active slots.
     - calls run_active_slots() repeatedly.

Now run_active_slots() we know about.  Before baa7b67d (HTTP slot
reuse fixes, 2006-03-10), we used to loop on slot->in_use but to fix
a bug we updated it to use slot->finished.

 * run_active_slot()
   - takes a slot
   - clears finished on its stack
   - makes slot->finished point at &finished on its stack
   - loops until "finished" is set
     - calls step_active_slots(); what it does can be seen above,
       but here, we need to know what process_curl_messages() it
       calls does, in order to complete some requests.
       * process_curl_messages() 
         - reads the response from curl
         - finds the slot with request that resulted in the response
         - sets its result member
         - calls finish_active_slot() on it, which in turn does these:
	   - calls closedown_active_slot(), slot->in_use becomes 0
           - sets (*slot->finished) = 1
	   - calls slot->callback_func

The callback_func was set to process_object_response() earlier in
this code flow.

 * http-walker.c::process_object_response()
   - calls process_http_object_request(), which dissociates the slot
     from the http_object_request object.
   - may call fetch_alternates() when the object is not found,
     otherwise calls finish_object_request().

Let's see what happens when fetch_alternates() gets called here.

 * http-walker.c::fetch_alternates()
   - calls step_active_slots() to make progress
   - calls get_active_slot() 
   - calls start_active_slot()
   - calls run_active_slot()

Now we can see how the "slot" we used in the "outer" run_active_slot()
can be reused for a different request.  We received response to the
request, and in process_curl_messages(), we called finish_active_slot()
on the slot, which did three things: (1) slot is now not-in-use, (2) the
"finished" on the stack of the outer run_active_slot() is set to 1, and
(3) called the process_object_response() callback.

The callback then asked for an unused slot, and got the slot we just
used, because we no longer need it (the necessary information in the
response have been copied away to http_object_request object before
the slot was dissociated from it, and the only one bit of
information the outer run_active_slot() needs has already been sent
there on its on-stack "finished" variable).  The reused slot goes
through the usual start_active_slot() call to add it to curlm, and
then the "inner" run_active_slot() is started on it.  Until the
inner run_active_slot() returns, fetch_alternates() would not
return, but once it does, the control goes back to the outer
run_active_slot(), where it finds that its "finished" is now set to
1.

This incidentally is a good illustration why the thread-starter
patch that did

	if (&finished == slot->finished)
		slot->finished = NULL;

would be sufficient, and the "clear only ours" guard is not
necessary, I think.  If the inner run_active_slot() did not trigger
a callback that adds more reuse of the slot, it will clear
slot->finished to NULL itself, with or without the guard.  And the
outer run_active_slot() may fail to clear if the guard is there, but
slot->finished is NULL in that case, so there is no point in clearing
it again.

And if the inner run_active_slot() did trigger a callback that ended
up reusing the slot, then eventually the innermost one would have
cleared slot->finished to NULL, with or without the guard, before it
returned the control to inner run_active_slot().  The inference goes
the same way to show that the guard is not necessary but is not
hurting.

I _think_ we can even get away by not doing anything to
slot->finished at the end of run_active_slot(), as we are not
multi-threaded and the callee only returns to the caller, but if it
helps pleasing the warning compiler, I'd prefer the simplest
workaround, perhaps with an unconditional clearing there?

What did I miss?  I must be missing something, as I can explain how
the current "(*slot->finished) = 1" with "while (finished)"
correctly works, but I cannot quite explain why the original "while
(slot->in_use)" would not, which is annoying.

In other words, why we needed baa7b67d (HTTP slot reuse fixes,
2006-03-10) in the first place?  It is possible that we had some
code paths that forgot to drop in_use before the inner run_active
returned that have been fixed in the 16 years and this fix was
hiding that bug, but I dunno.
Junio C Hamano May 24, 2022, 5:45 p.m. UTC | #13
Daniel Stenberg <daniel@haxx.se> writes:

> On Mon, 23 May 2022, Junio C Hamano wrote:
>
>>> It calls into cURL, which I suspect has a multi-threaded mode of
>>> operation,
>>
>> https://curl.se/libcurl/c/threadsafe.html ;-)
>>
>> My understanding is that what we have is pretty much select() driven
>> single-threaded multi-fd transfer.
>
> Confirmed. libcurl *can* use threads (if built that way), but the only
> use it has for such subthreads is for resolving host names. libcurl,
> its API and its callbacks etc always operate in the same single
> thread.

Thanks.  

It always is nice to have the authority/expert readily answering our
stupid questions ;-)
Carlo Marcelo Arenas Belón May 24, 2022, 8:16 p.m. UTC | #14
On Tue, May 24, 2022 at 10:15:57AM -0700, Junio C Hamano wrote:
> 
> I _think_ we can even get away by not doing anything to
> slot->finished at the end of run_active_slot(), as we are not
> multi-threaded and the callee only returns to the caller, but if it
> helps pleasing the warning compiler, I'd prefer the simplest
> workaround, perhaps with an unconditional clearing there?

Assuming that some overly clever compiler might optimize that out (either
because it might think it is Undefined Behaviour or for other unknown
reasons) then Ævar's version would be better for clearing the "warning".

But your patch fixed the "bug" that a probably overeager compiler was
"detecting".

> What did I miss?  I must be missing something, as I can explain how
> the current "(*slot->finished) = 1" with "while (finished)"
> correctly works, but I cannot quite explain why the original "while
> (slot->in_use)" would not, which is annoying.

My guess is that there is a curl version somewhere that is patched to use
threads more extensible than upstream and where this code is stil needed.
I think it is also safe to assume (like you did) that this is a 16 year bug
that was already fixed and reverting that code would be an alternative too.

Carlo
Ævar Arnfjörð Bjarmason May 24, 2022, 8:38 p.m. UTC | #15
On Tue, May 24 2022, Junio C Hamano wrote:

> [...]
> This incidentally is a good illustration why the thread-starter
> patch that did
>
> 	if (&finished == slot->finished)
> 		slot->finished = NULL;
>
> would be sufficient, and the "clear only ours" guard is not
> necessary, I think.  If the inner run_active_slot() did not trigger
> a callback that adds more reuse of the slot, it will clear
> slot->finished to NULL itself, with or without the guard.  And the
> outer run_active_slot() may fail to clear if the guard is there, but
> slot->finished is NULL in that case, so there is no point in clearing
> it again.
>
> And if the inner run_active_slot() did trigger a callback that ended
> up reusing the slot, then eventually the innermost one would have
> cleared slot->finished to NULL, with or without the guard, before it
> returned the control to inner run_active_slot().  The inference goes
> the same way to show that the guard is not necessary but is not
> hurting.
>
> I _think_ we can even get away by not doing anything to
> slot->finished at the end of run_active_slot(), as we are not
> multi-threaded and the callee only returns to the caller, but if it
> helps pleasing the warning compiler, I'd prefer the simplest
> workaround, perhaps with an unconditional clearing there?

I'll admit I haven't fully looked into this again, but does anything in
the subsequent analysis suggest that my original patch wouldn't be a
working solution to this, still:
https://lore.kernel.org/git/patch-1.1-1cec367e805-20220126T212921Z-avarab@gmail.com/ ?

The advantage of it over any small and narrow fix like your hunk quoted
above is that it doesn't make the end result look as though we care
about e.g. thread races, which evidently is something more than one
person looking at this has (needlessly) ended up worrying about.

> What did I miss?  I must be missing something, as I can explain how
> the current "(*slot->finished) = 1" with "while (finished)"
> correctly works, but I cannot quite explain why the original "while
> (slot->in_use)" would not, which is annoying.

Perhaps that change was also worried about thread safety? I only briefly
re-looked at this again, but I don't think I ever found exactly what
that 2006-era fix was meant to fix, specifically.

> In other words, why we needed baa7b67d (HTTP slot reuse fixes,
> 2006-03-10) in the first place?  It is possible that we had some
> code paths that forgot to drop in_use before the inner run_active
> returned that have been fixed in the 16 years and this fix was
> hiding that bug, but I dunno.

I haven't found that out either, either back in January or just now.
Ævar Arnfjörð Bjarmason May 24, 2022, 8:45 p.m. UTC | #16
On Tue, May 24 2022, Carlo Marcelo Arenas Belón wrote:

> On Tue, May 24, 2022 at 10:15:57AM -0700, Junio C Hamano wrote:
>> 
>> I _think_ we can even get away by not doing anything to
>> slot->finished at the end of run_active_slot(), as we are not
>> multi-threaded and the callee only returns to the caller, but if it
>> helps pleasing the warning compiler, I'd prefer the simplest
>> workaround, perhaps with an unconditional clearing there?
>
> Assuming that some overly clever compiler might optimize that out (either
> because it might think it is Undefined Behaviour or for other unknown
> reasons) then Ævar's version would be better for clearing the "warning".
>
> But your patch fixed the "bug" that a probably overeager compiler was
> "detecting".

Just briefly for those who perhaps didn't fully read the initial
thread. Per [1] and [2] (search for -fanalyzer in that [2]) it's not a
bug, undesired behavior etc. from GCC that it's "overeager" to warn in
this case.

Most warnings from compilers are in the category of not being triggered
on the basis of exhaustive code analysis which tries to prove that it's
a practical issue for *your codebase*. It's equally about warning you
about patterns that might be a problem in the future.

In this case I don't know how this line of reasoning started, or how the
output is confusing.

E.g. Johannes notes upthread that the "complaint is not actually
accurate". Well, yes it is, because the warning says:
	
	http.c: In function ‘run_active_slot’:
	http.c:1332:24: warning: storing the address of local variable ‘finished’ in ‘*slot.finished’ [-Wdangling-pointer=]
	 1332 |         slot->finished = &finished;
	      |         ~~~~~~~~~~~~~~~^~~~~~~~~~~

I.e. it's telling us that we're *storing* the address, which we're
doing. "Storing" meaning "past the lifetime of the function".

It doesn't mean that GCC has additionally proved that we'll later used
it in a way that will have a meaningful impact on the behavior of our
program, or even that it's tried to do that. See an excerpt from the GCC
code (a comment) in [1].

1. https://lore.kernel.org/git/220127.86mtjhdeme.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/220414.86h76vd69t.gmgdl@evledraar.gmail.com/
Junio C Hamano May 24, 2022, 10:16 p.m. UTC | #17
Carlo Marcelo Arenas Belón <carenas@gmail.com> writes:

> My guess is that there is a curl version somewhere that is patched to use
> threads more extensible than upstream and where this code is stil needed.
> I think it is also safe to assume (like you did) that this is a 16 year bug
> that was already fixed and reverting that code would be an alternative too.

Sorry, but this does not make any sense to me.

It wasn't like we were working around somebody else's bug 16 years
ago.  Reverting the bugfix we made 16 years ago will make the issue
fixed 16 years ago to reappear.

Also, even if your version of curl library uses multi-threading
internally, it cannot magically make _our_ calls into curl library
to be multi-threaded, letting a control flow that came from
run_active_slot() down to another recursive invocation of
run_active_slot() to spin there calling curl_multi_perform()
repeatedly, while returning the control to the outer
run_active_slot() at the same time.  So "a curl version somewhere
that is patched" does not sound like a plausible explanation,
either.
Junio C Hamano May 24, 2022, 10:28 p.m. UTC | #18
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> I _think_ we can even get away by not doing anything to
>> slot->finished at the end of run_active_slot(), as we are not
>> multi-threaded and the callee only returns to the caller, but if it
>> helps pleasing the warning compiler, I'd prefer the simplest
>> workaround, perhaps with an unconditional clearing there?
>
> I'll admit I haven't fully looked into this again, but does anything in
> the subsequent analysis suggest that my original patch wouldn't be a
> working solution to this, still:
> https://lore.kernel.org/git/patch-1.1-1cec367e805-20220126T212921Z-avarab@gmail.com/ ?

I traced _one_ code path as a demonstration to show why the current
"slot->finished = &finished" based solution works.  

But I think what we need is to demonstrate a code path in the old
version that shows why the old slot->in_use would not have worked
and the slot->finished was needed, and demonstrate why it NO LONGER
is the case in today's code.  Without that, especially with the
latter, I cannot take the "just revert 16-year old bugfix because a
new compiler throws a warning related to multi-threaded code to it,
even though we are strictly single-threaded" as a serious solution.

And because I do not think I've seen anybody has done that necessary
digging, I would still prefer the "if the compiler somehow cares,
then let's clear the finished member once we are done with it" much
better than "we do not know why but we somehow think we can do
without this bugfix, even though we wouldn't be making noises about
this piece of code if a new compiler did not start emitting a
warning".

Thanks.
Junio C Hamano May 24, 2022, 10:34 p.m. UTC | #19
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> It doesn't mean that GCC has additionally proved that we'll later used
> it in a way that will have a meaningful impact on the behavior of our
> program, or even that it's tried to do that. See an excerpt from the GCC
> code (a comment) in [1].

But that means the warning just as irrelevant as "you stored 438 to
this integer variable".  Sure, there may be cases where that integer
variable should not exceed 400 and if the compiler can tell us that,
that would be a valuable help to developers.  But "you stored an
address of an object that can go out of scope in another object
whose lifetime lasts beyond the scope" alone is not, without "and
the caller that passed the latter object later dereferences that
address here".  We certainly shouldn't -Werror on such a warning
and bend our code because of it.
Junio C Hamano May 24, 2022, 11:19 p.m. UTC | #20
Carlo Marcelo Arenas Belón <carenas@gmail.com> writes:

> On Tue, May 24, 2022 at 10:15:57AM -0700, Junio C Hamano wrote:
>> 
>> I _think_ we can even get away by not doing anything to
>> slot->finished at the end of run_active_slot(), as we are not
>> multi-threaded and the callee only returns to the caller, but if it
>> helps pleasing the warning compiler, I'd prefer the simplest
>> workaround, perhaps with an unconditional clearing there?
>
> Assuming that some overly clever compiler might optimize that out (either
> because it might think it is Undefined Behaviour or for other unknown
> reasons) then Ævar's version would be better for clearing the "warning".

You keep saying undefined behaviour but in this case I do not quite
see there is anything undefined.

The warning, as Ævar said in a message, is about storing an address
of an object on the stack in another object whose lifetime lasts
beyond the life of the stackframe.  If you dereference such a
pointer _after_ we return from run_active_slot() function, the
behaviour may indeed be undefined.

But if you recall one such call trace I walked through for Dscho in
another message this morning, we do not make such a dereferencing.
The run_active_slot() function sets up the slot with the pointer to
its on-stack variable in it, we make a call chain that is several
levels deep, and at some point in the call chain, the request that
is represented by the slot may complete and slot may be passed to
finish_active_slot(), which would update (*slot->finished) thus
modifying the on-stack variable of the run_active_slot() that we
will eventually return to.

Is such a use of the pointer in the structure a cause for an
undefined behaviour?
Carlo Marcelo Arenas Belón May 25, 2022, 2:02 a.m. UTC | #21
Please, I apologize in advance because I am making statements way
above my paygrade and without the relevant foundation (because as I
mentioned, I looked at it briefly a while ago, and hadn't been able to
get the time to complete my analysis).

I did read all your analysis and while as Aevar pointed out, I might
be mistaken, or not making myself clear enough, it is not intentional
and if told to do so, will extract myself from this thread until I can
do a full analysis.

On Tue, May 24, 2022 at 4:19 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Is such a use of the pointer in the structure a cause for an
> undefined behaviour?

If you make an IMHO overly strict reading of the standard (which it
would seem gcc12 implementers might have done) then ANY access to a
pointer that might be out of scope is Undefined Behaviour.

gcc doesn't know what happens to the variable after it gets shipped
out of this function inside that opaque structure that is passed
around between threads in curl, so it MIGHT reasonably assume that:

1) There is a layering violation somewhere and another thread is
modifying that field to point it to a different thread local from the
same function.
2) Once it gets the value back, then it can assume that reading that
pointer might be undefined behaviour because it might be coming from a
different stack frame than ours (after all, any sane developer would
store instead a flag)
3) since the if uses a condition that is UB then it can optimize it out

I can't see any other reason why in the original code, with the if,
the warning was still being triggered, but not without it.

BTW while trying to debug gcc to find out if this is a bug or a lost
opportunity for optimization of something else, I noticed that it
would only trigger in cases where the structure was shipped outside
the function through another function, not when it was returned back
from it, for example.

I don't even have gcc-12 in the current workstation I am using, but
would prepare a machine with it and debug further if you think giving
a more complete answer is necessary.

Carlo
Michael J Gruber May 25, 2022, 9:08 a.m. UTC | #22
Junio C Hamano venit, vidit, dixit 2022-05-25 00:34:40:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> 
> > It doesn't mean that GCC has additionally proved that we'll later used
> > it in a way that will have a meaningful impact on the behavior of our
> > program, or even that it's tried to do that. See an excerpt from the GCC
> > code (a comment) in [1].
> 
> But that means the warning just as irrelevant as "you stored 438 to
> this integer variable".  Sure, there may be cases where that integer
> variable should not exceed 400 and if the compiler can tell us that,
> that would be a valuable help to developers. 

An integer can hold 438 perfectly, without any help by carefully
designed code.

> But "you stored an
> address of an object that can go out of scope in another object
> whose lifetime lasts beyond the scope" alone is not, without "and
> the caller that passed the latter object later dereferences that
> address here".  We certainly shouldn't -Werror on such a warning
> and bend our code because of it.

A global variable cannot hold a meaningful pointer to a local variable,
unless the carefully designed code helps. So that "analogy" rather
highlights the essential difference, unless you think about a pointer
as "just some number" rather than "something that can be dereferenced".

[read global as outer scope, local as inner scope for simplicity]

Common practice is not necessarily good practice. In a traditional C
mindset, everything around pointers and memory management is doomed to
boom unless the code is designed carefully and "you know what you are
doing". I'm not indicating that you do not - on the contrary, you very
well do, as your detailed analysis of the code flow shows. At the same
time, it shows that we cannot be certain about that piece of code
without that detailed expert analysis.

C is not C++ nor rust, nor should it be; but the warnings and errors in
newer standards typically try to avoid those pitfalls by making sure
that, e.g., pointers do not go stale for "obvious reasons". They might
missflag cases where this is preempted for non-obvious reasons, but forcing
you to be explicit (more obvious) in your code is a good thing,
especially for maintainability of the code base.

Pointing from outer scope to memory in an inner scope should be a no-go;
that's what this error is about. Unsetting that pointer (by setting the
pointer to NULL) right before the inner scope ends is exactly the
right solution. If this "breaks" the code, the code is broken already.

Ironically, my original one line patch seems to work here, as your
detailed analysis shows. Truth in advertising: I arrived at that patch
after a considerably less detailed analysis ;)

Michael
Johannes Schindelin May 25, 2022, 10:07 a.m. UTC | #23
Hi Junio,

On Tue, 24 May 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
> >> I _think_ we can even get away by not doing anything to
> >> slot->finished at the end of run_active_slot(), as we are not
> >> multi-threaded and the callee only returns to the caller, but if it
> >> helps pleasing the warning compiler, I'd prefer the simplest
> >> workaround, perhaps with an unconditional clearing there?
> >
> > I'll admit I haven't fully looked into this again, but does anything in
> > the subsequent analysis suggest that my original patch wouldn't be a
> > working solution to this, still:
> > https://lore.kernel.org/git/patch-1.1-1cec367e805-20220126T212921Z-avarab@gmail.com/ ?
>
> I traced _one_ code path as a demonstration to show why the current
> "slot->finished = &finished" based solution works.
>
> But I think what we need is to demonstrate a code path in the old
> version that shows why the old slot->in_use would not have worked
> and the slot->finished was needed, and demonstrate why it NO LONGER
> is the case in today's code.  Without that, especially with the
> latter, I cannot take the "just revert 16-year old bugfix because a
> new compiler throws a warning related to multi-threaded code to it,
> even though we are strictly single-threaded" as a serious solution.
>
> And because I do not think I've seen anybody has done that necessary
> digging, I would still prefer the "if the compiler somehow cares,
> then let's clear the finished member once we are done with it" much
> better than "we do not know why but we somehow think we can do
> without this bugfix, even though we wouldn't be making noises about
> this piece of code if a new compiler did not start emitting a
> warning".

The commit in question is baa7b67d091 (HTTP slot reuse fixes, 2006-03-10),
and I did look around in the Git mailing list archives for mails that were
sent around the same date, but did not see much that would help understand
the context, except that the patch series clearly talks about `http-push`:
https://lore.kernel.org/git/20060311041749.GB3997@reactrix.com/

The thing about `http-push` is that it adds a "fill function" that is
executed in `fill_active_slots()`, which is called in turn by
`step_active_slots()`, which, as you will recall, is called within that
busy loop in `run_active_slot()`.

And that "fill function" is where it starts to get interesting.

It's called `fill_active_slot()`:
https://github.com/git/git/blob/v2.36.1/http-push.c#L604-L625

This function starts requests, such as fetching loose objects, starting a
PUT or a MKCOL. Notably, though, `fill_active_slot()` does not wait for
the request to finish. In other words, it will potentially reuse the
current slot if it was _just_ marked as no longer in use, and then the
code flow will eventually return to that loop in `run_active_slot()`, with
any reused slot still being marked as `in_use`.

So yes, reverting that commit would reintroduce the regression, and I am
very happy that we now have a grip on this Chesterton's Fence.

This same analysis, of course, also puts a nail into the coffin of the
`reserved_for_use` idea because while it would fix the reuse bug, it would
unnecessarily squat on slots that might well be needed.

Ciao,
Dscho
Ævar Arnfjörð Bjarmason May 25, 2022, 1:27 p.m. UTC | #24
On Tue, May 24 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> It doesn't mean that GCC has additionally proved that we'll later used
>> it in a way that will have a meaningful impact on the behavior of our
>> program, or even that it's tried to do that. See an excerpt from the GCC
>> code (a comment) in [1].
>
> But that means the warning just as irrelevant as "you stored 438 to
> this integer variable".  Sure, there may be cases where that integer
> variable should not exceed 400 and if the compiler can tell us that,
> that would be a valuable help to developers.  But "you stored an
> address of an object that can go out of scope in another object
> whose lifetime lasts beyond the scope" alone is not, without "and
> the caller that passed the latter object later dereferences that
> address here".  We certainly shouldn't -Werror on such a warning
> and bend our code because of it.

I think it says something that 1) we had exactly one of these in our
codebase 2) as we've discussed the pointer isn't actually *needed*
outside the scope of the function, it's just left-over.

Now, if it were used, e.g. let's say we had some code that took the
struct and inspected its members we'd likely have a segfault here, or
worse it would "work", but only on the platforms we'd test at first.

Which isn't the case with a leftover "int finished" holding a 438.

The point of this warning, like so many others, is to ask "hey, do you
really need to be running around with this particular pair of
scissors?".
Junio C Hamano May 25, 2022, 4:40 p.m. UTC | #25
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> This same analysis, of course, also puts a nail into the coffin of the
> `reserved_for_use` idea because while it would fix the reuse bug, it would
> unnecessarily squat on slots that might well be needed.

It is like that an in-kernel structure that represents a process has
to stay around in the zombie state until its exit status is culled.
With s/reserved_for_use/zombie/ the name of the new member would
make more sense ;-)

With the "slot->finished" trick, compared to the approach to delay
the reuse, we can reuse them a bit earlier, but because I do not
think we accumulate unbounded number of these zombie requests, and
when we run out the active slots in the active queue, and because
get_active_slot() will allocate a new one, the wastage might not be
too bad.

So, I am not sure if it is that bad to be called a nail in the
coffin.

In any case, https://github.com/git/git/actions/runs/2381379417 is
the run with the single liner "clear slot->finished before leaving"
with your other 3 gcc12 fixes.  The tests are not clean because we
have linux-leaks complaining on ds/bundle-uri-more RFC patches and
win test (9) seems to have issues with t7527 (fsmonitor), but I am
taking the fact that any of the "win test" jobs even start as an
evidence that we have pleased gcc12 enough to get there?

Thanks.
Daniel Stenberg May 26, 2022, 2:15 p.m. UTC | #26
On Tue, 24 May 2022, Junio C Hamano wrote:

> It always is nice to have the authority/expert readily answering our stupid 
> questions ;-)

I probably miss a few here and there, but I do keep an eye out for curl 
related subjects where I can help. After all, I'm git fan and user myself.
diff mbox series

Patch

diff --git a/http.c b/http.c
index 229da4d148..85437b1980 100644
--- a/http.c
+++ b/http.c
@@ -1367,6 +1367,9 @@  void run_active_slot(struct active_request_slot *slot)
 			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
 		}
 	}
+
+	if (slot->finished == &finished)
+		slot->finished = NULL;
 }
 
 static void release_active_slot(struct active_request_slot *slot)