Re* jc/http-clear-finished-pointer

Message ID xmqq7d68ytj8.fsf_-_@gitster.g (mailing list archive)
State New, archived

Commit Message

Junio C Hamano May 26, 2022, 7:37 p.m. UTC
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Thu, May 26 2022, Junio C Hamano wrote:
>
>> * jc/http-clear-finished-pointer (2022-05-24) 1 commit
>>  - http.c: clear the 'finished' member once we are done with it
>>
>>  Meant to go with js/ci-gcc-12-fixes
>>
>>  Will merge to 'next'?
>>  source: <xmqqczgqjr8y.fsf_-_@gitster.g>
>
> The end of the proposed commit message says:
>
>     [...]Clear the finished member before the control leaves the
>     function, which has a side effect of unconfusing compilers like
>     recent GCC 12 that is over-eager to warn against such an assignment.
>
> I cannot reproduce this suppressing the warning as noted in past
> exchanges, it's not affected by this "clear if we set it" pattern. It
> needs to be unconditionally cleared.

Interesting.  I still have conditional clearing in the tree, though
I was reasonably sure I got rid of the conditional and made it
always clear, when I rewrote that part of the log message.  After
all, I ran "commit --amend" so that I do not forget the issue after
sending https://lore.kernel.org/git/xmqqleurlt31.fsf@gitster.g/ X-<.

Thanks for catching.  What is queued is not what I intended to
queue.

But there is one thing that is puzzling.  Ever since this, together
with the three patches from Dscho for gcc12, got included in 'seen',
the branch started passing the Windows build that used to complain
and did not work, so at least with the version of gcc12 used over
there, it apparently is sufficient to clear only when we are
responsible for placing an address that is about to become invalid,
while leaving the pointer we didn't stuff in unmodified.

As far as I understand, with the most recent analysis by Dscho on
the http-push codepath, we can return to the loop while the slot is
holding a different request that is unrelated to ours that has
already finished without recursively calling run_active_slot(), and
with the current *(slot->finished)=1 trick, it will successfully
notify our loop that our request is done, even though slot->in_use
is set to true back again when it happens.  But by definition, at
that point, slot->finished is not used by anybody (obviously not by
us, but also not by the request that is currently using the slot,
because it hasn't used run_active_slot() and slot->finished is not
touched by it), so it is safe to unconditionally clear the member.
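
The scenario above can be sketched with a toy model.  This is an
illustration only, not Git's code: struct toy_slot and every toy_*()
function are hypothetical stand-ins, and toy_step() assumes the worst
case where a request completes and the slot is reused within a single
step.  It contrasts a loop that watches in_use with one that watches
the on-stack flag, and ends with the unconditional clearing.

```c
#include <stddef.h>

/* Toy stand-in for struct active_request_slot; NOT Git's definition. */
struct toy_slot {
	int in_use;
	int *finished;
};

/* Models the relevant part of finish_active_slot(). */
static void toy_finish(struct toy_slot *slot)
{
	slot->in_use = 0;
	if (slot->finished)
		*slot->finished = 1;
}

/* Models get_active_slot() handing the free slot to another request. */
static void toy_reuse(struct toy_slot *slot)
{
	slot->in_use = 1;
	slot->finished = NULL;
}

/*
 * Models one step_active_slots() call during which the current
 * request completes and the slot is reused by an unrelated request.
 */
static void toy_step(struct toy_slot *slot)
{
	toy_finish(slot);
	toy_reuse(slot);
}

/* Old style: loop on in_use.  Only the step cap stops it here. */
static int toy_run_old(struct toy_slot *slot, int max_steps)
{
	int steps = 0;

	while (slot->in_use && steps < max_steps) {
		toy_step(slot);
		steps++;
	}
	return steps;
}

/* New style: loop on an on-stack flag reached via slot->finished. */
static int toy_run_new(struct toy_slot *slot, int max_steps)
{
	int finished = 0;
	int steps = 0;

	slot->finished = &finished;
	while (!finished && steps < max_steps) {
		toy_step(slot);
		steps++;
	}
	/* Nobody needs the member any more; clear it unconditionally. */
	slot->finished = NULL;
	return steps;
}
```

With a slot initialized as { 1, NULL }, toy_run_old() keeps looping
until it hits the cap because in_use is true again after every step,
while toy_run_new() leaves the loop after the first step even though
the slot is in use by the other request.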

----- >8 --------- >8 --------- >8 --------- >8 --------- >8 -----
Subject: [PATCH v3] http.c: clear the 'finished' member once we are done with it

In http.c, the run_active_slot() function allows the given "slot" to
make progress by calling step_active_slots() in a loop repeatedly,
and the loop is not left until the request held in the slot
completes.

Ages ago, we used to use the slot->in_use member to get out of the
loop, which misbehaved when the request in "slot" completes (at
which time, the result of the request is copied away from the slot,
and the in_use member is cleared, making the slot ready to be
reused), and the "slot" gets reused to service a different request
(at which time, the "slot" becomes in_use again, even though it is
for a different request).  The loop terminating condition mistakenly
thought that the original request has yet to be completed.

Today's code, after baa7b67d (HTTP slot reuse fixes, 2006-03-10)
fixed this issue, uses a separate "slot->finished" member that is
set in run_active_slot() to point to an on-stack variable, and the
code that completes the request in finish_active_slot() sets the
on-stack variable via the pointer to signal that the particular
request held by the slot has completed.  It also clears the in_use
member (as before that fix), so that the slot itself can safely be
reused for an unrelated request.

One thing that is not quite clean in this arrangement is that,
unless the slot gets reused, at which point the finished member is
reset to NULL, the member keeps the value of &finished, which
becomes a dangling pointer into the stack when run_active_slot()
returns.  Clear the finished member before the control leaves the
function, which has a side effect of unconfusing compilers like
recent GCC 12 that is over-eager to warn against such an assignment.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 http.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

Comments

Johannes Schindelin May 27, 2022, 8:41 p.m. UTC | #1
Hi Junio,

On Thu, 26 May 2022, Junio C Hamano wrote:

> diff --git a/http.c b/http.c
> index 229da4d148..9a98372f74 100644
> --- a/http.c
> +++ b/http.c
> @@ -1367,6 +1367,32 @@ void run_active_slot(struct active_request_slot *slot)
>  			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
>  		}
>  	}
> +
> +	/*
> +	 * The value of slot->finished we set before the loop was used
> +	 * to set our "finished" variable when our request completed.
> +	 *
> +	 * 1. The slot may not have been reused for another request
> +	 *    yet, in which case it still has &finished.
> +	 *
> +	 * 2. The slot may already be in-use to serve another request,
> +	 *    which can further be divided into two cases:
> +	 *
> +	 * (a) If run_active_slot() hasn't been called for that
> +	 *     other request, slot->finished may still have the
> +	 *     address of our &finished.
> +	 *
> +	 * (b) If the request did call run_active_slot(), then the
> +	 *     call would have updated slot->finished at the beginning
> +	 *     of this function, and with the clearing of the member
> +	 *     below, we would find that slot->finished is now NULL.
> +	 *
> +	 * In all cases, slot->finished has no useful information to
> +	 * anybody at this point.  Some compilers warn us for
> +	 * attempting to smuggle a pointer that is about to become
> +	 * invalid, i.e. &finished.  We clear it here to assure them.
> +	 */
> +	slot->finished = NULL;
>  }
>
>  static void release_active_slot(struct active_request_slot *slot)
> --
> 2.36.1-306-g0dbcc0e187

I just verified that there is currently no other location in Git's code
that assigns a non-NULL value to `slot->finished` than
`run_active_slot()`. Otherwise we would potentially overwrite the value
here (which is why I preferred the conditional assignment, which does not
shut up GCC though). So for now, this solution is safe.

Having said that, it is quite puzzling that GCC thinks it is safe to
assign a local variable's pointer to a struct that is then accessed
outside the current file. This would make it easy to copy and use the
pointer well after the function scope was left. This is _not_ the case in
Git's source code, but GCC seems to think that this isn't possible by
(mis-)interpreting the final `slot->finished = NULL` to mean that the
`slot->finished = &finished` was safe (because it clearly isn't). In GCC's
defense, there is probably a lot of code out there that would no longer
compile if they truly enforced the new `-Wdangling-pointer` rule
correctly.

With all that, here is my ACK.

Ciao,
Dscho

Junio C Hamano May 27, 2022, 9:35 p.m. UTC | #2
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

This analysis and explanation was actually still too conservative.

>> +	/*
>> +	 * The value of slot->finished we set before the loop was used
>> +	 * to set our "finished" variable when our request completed.
>> +	 *
>> +	 * 1. The slot may not have been reused for another request
>> +	 *    yet, in which case it still has &finished.
>> +	 *
>> +	 * 2. The slot may already be in-use to serve another request,
>> +	 *    which can further be divided into two cases:
>> +	 *
>> +	 * (a) If run_active_slot() hasn't been called for that
>> +	 *     other request, slot->finished may still have the
>> +	 *     address of our &finished.
>> +	 *
>> +	 * (b) If the request did call run_active_slot(), then the
>> +	 *     call would have updated slot->finished at the beginning
>> +	 *     of this function, and with the clearing of the member
>> +	 *     below, we would find that slot->finished is now NULL.

Given that there are only two places that assign to slot->finished
(one is at the beginning of run_active_slot(), the other is when
get_active_slot() either allocates a new one or decides to reuse the
one that finish_active_slot() has marked to be no longer in use),
all in "reused" cases, not just 2-(b), but even in case 2-(a),
slot->finished should be NULL.

>> +	 * In all cases, slot->finished has no useful information to
>> +	 * anybody at this point.  Some compilers warn us for
>> +	 * attempting to smuggle a pointer that is about to become
>> +	 * invalid, i.e. &finished.  We clear it here to assure them.
>> +	 */
>> +	slot->finished = NULL;
>>  }

The conclusion is still valid.  The slot->finished member will not
have any useful information when we get here.  It either keeps the
value &finished we stored if the slot wasn't reused, or it has NULL
if it was reused.
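
The two outcomes can be checked in a toy model.  This is a sketch
under stated assumptions, not Git's code: struct demo_slot, enum
demo_state, demo_classify() and demo_run() are hypothetical names, and
the "request completes" and "slot is reused" events are inlined rather
than driven by curl.

```c
#include <stddef.h>

/* Toy stand-in for struct active_request_slot; NOT Git's definition. */
struct demo_slot {
	int in_use;
	int *finished;
};

enum demo_state {
	DEMO_STILL_OURS,      /* slot not reused: member holds &finished */
	DEMO_RESET_BY_REUSE,  /* slot reused: member was reset to NULL */
	DEMO_UNEXPECTED
};

/* Classify the member against the address of our on-stack flag. */
static enum demo_state demo_classify(const struct demo_slot *slot,
				     const int *ours)
{
	if (slot->finished == ours)
		return DEMO_STILL_OURS;
	if (!slot->finished)
		return DEMO_RESET_BY_REUSE;
	return DEMO_UNEXPECTED;
}

/* Models run_active_slot(); "reuse" selects which case to play out. */
static enum demo_state demo_run(struct demo_slot *slot, int reuse)
{
	int finished = 0;
	enum demo_state state;

	slot->finished = &finished;

	/* Our request completes, as in finish_active_slot(). */
	slot->in_use = 0;
	*slot->finished = 1;

	if (reuse) {
		/* Another request takes the slot, as in get_active_slot(). */
		slot->in_use = 1;
		slot->finished = NULL;
	}

	state = demo_classify(slot, &finished);

	/* Either way the value is useless to everybody; clear it. */
	slot->finished = NULL;
	return state;
}
```

demo_run(&slot, 0) reports DEMO_STILL_OURS and demo_run(&slot, 1)
reports DEMO_RESET_BY_REUSE; in both cases the member is NULL once the
function has returned, so no pointer into the dead stack frame
survives.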

> Having said that, it is quite puzzling that GCC thinks it is safe to
> assign a local variable's pointer to a struct that is then accessed
> outside the current file. This would make it easy to copy and use the
> pointer well after the function scope was left. This is _not_ the case in
> Git's source code, but GCC seems to think that this isn't possible by
> (mis-)interpreting the final `slot->finished = NULL` to mean that the
> `slot->finished = &finished` was safe (because it clearly isn't).

I do not think it is compiler's job to prevent us from assigning to
slot->finished, in fear of somebody downstream copying the pointer
away to a global variable that can be accessed long after we return.

We can store a pointer to an object we malloc() in a member of a
struct, pass the struct to somebody else, who might copy the member
away in a global variable for their own use later, but when they
give control back to us, we may free() the object via the member in
the struct.  The compiler may see that we free() the pointer we stored
in the struct, but it would not warn us against doing so.
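
A minimal sketch of that malloc()/free() analogy follows.  All names
here (struct demo_holder, demo_stash, demo_callee(), demo_caller())
are hypothetical and exist only for illustration.

```c
#include <stdlib.h>
#include <string.h>

struct demo_holder {
	char *buf;
};

/* A copy that "somebody else" keeps for their own use later. */
static char *demo_stash;

static void demo_callee(struct demo_holder *h)
{
	demo_stash = h->buf;
}

/* Returns 1 if the callee kept a copy, 0 if not, -1 on allocation failure. */
static int demo_caller(void)
{
	struct demo_holder h;
	int copied;

	h.buf = malloc(16);
	if (!h.buf)
		return -1;
	strcpy(h.buf, "hello");

	demo_callee(&h);
	copied = (demo_stash == h.buf);

	/*
	 * Perfectly legal from the caller's point of view, even though
	 * the callee's copy in demo_stash dangles from here on and must
	 * never be dereferenced.
	 */
	free(h.buf);
	h.buf = NULL;
	return copied;
}
```

The caller cannot tell from its own code whether anybody kept a copy;
keeping such a copy valid is the responsibility of whoever made it.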

> With all that, here is my ACK.

Thanks.

Ævar Arnfjörð Bjarmason June 1, 2022, 7:26 a.m. UTC | #3
On Thu, May 26 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Thu, May 26 2022, Junio C Hamano wrote:
>>
>>> * jc/http-clear-finished-pointer (2022-05-24) 1 commit
>>>  - http.c: clear the 'finished' member once we are done with it
>>>
>>>  Meant to go with js/ci-gcc-12-fixes
>>>
>>>  Will merge to 'next'?
>>>  source: <xmqqczgqjr8y.fsf_-_@gitster.g>
>>
>> The end of the proposed commit message says:
>>
>>     [...]Clear the finished member before the control leaves the
>>     function, which has a side effect of unconfusing compilers like
>>     recent GCC 12 that is over-eager to warn against such an assignment.
>>
>> I cannot reproduce this suppressing the warning as noted in past
>> exchanges, it's not affected by this "clear if we set it" pattern. It
>> needs to be unconditionally cleared.
>
> Interesting.  I still have conditional clearing in the tree, though
> I was reasonably sure I got rid of the conditional and made it
> always clear, when I rewrote that part of the log message.  After
> all, I ran "commit --amend" so that I do not forget the issue after
> sending https://lore.kernel.org/git/xmqqleurlt31.fsf@gitster.g/ X-<.
>
> Thanks for catching.  What is queued is not what I intended to
> queue.
>
> But there is one thing that is puzzling.  Ever since this, together
> with the three patches from Dscho for gcc12, got included in 'seen',
> the branch started passing the Windows build that used to complain
> and did not work, so at least with the version of gcc12 used over
> there, it apparently is sufficient to clear only when we are
> responsible for placing an address that is about to become invalid,
> while leaving the pointer we didn't stuff in unmodified.

I didn't find what specific build(s) you were referring to, but perhaps
this is due to an interaction with 9c539d1027d (config.mak.dev:
alternative workaround to gcc 12 warning in http.c, 2022-04-15)?
I.e. with DEVELOPER=1 we'd already get past the warning with
gcc+Windows, presumably (but I haven't confirmed whether the
detect-compiler etc. works the same there).

> As far as I understand, with the most recent analysis by Dscho on
> the http-push codepath, we can return to the loop while the slot is
> holding a different request that is unrelated to ours that has
> already finished without recursively calling run_active_slot(), and
> with the current *(slot->finished)=1 trick, it will successfully
> notify our loop that our request is done, even though slot->in_use
> is set to true back again when it happens.  But by definition, at
> that point, slot->finished is not used by anybody (obviously not by
> us, but also not by the request that is currently using the slot,
> because it hasn't used run_active_slot() and slot->finished is not
> touched by it), so it is safe to unconditionally clear the member.
>
> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 -----
> Subject: [PATCH v3] http.c: clear the 'finished' member once we are done with it

I see this is in "next" already.  As a follow-up we should have a revert
of 9c539d1027d (config.mak.dev: alternative workaround to gcc 12 warning
in http.c, 2022-04-15) which this patch makes redundant.

Junio C Hamano June 1, 2022, 3:48 p.m. UTC | #4
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> ... as a follow-up we should have a revert
> of 9c539d1027d (config.mak.dev: alternative workaround to gcc 12 warning
> in http.c, 2022-04-15) which this patch makes redundant.

It is a good point that 9c539d10 (config.mak.dev: alternative
workaround to gcc 12 warning in http.c, 2022-04-15) hides problems.
Unfortunately https://github.com/git/git/runs/6665825113 dies early
without reaching the point where it touches http.c X-< so we cannot
tell.

I am OK to do a revert of the redundant one on 'next', which should
give us a confirmation that the particular version of GCC12 is OK
with the code we happen to have and letting its -Wdangling-pointer
trigger would not harm us with any other false positives.  Then we
can do the same on the 'master' front with that revert with Dscho's
other three GCC12 fixes (they are fixes, like the "realloc then use"
one) graduating first (which should break http.c) and then the patch
in this thread as separate steps, if we really wanted to.

Thanks.

Patch

diff --git a/http.c b/http.c
index 229da4d148..9a98372f74 100644
--- a/http.c
+++ b/http.c
@@ -1367,6 +1367,32 @@  void run_active_slot(struct active_request_slot *slot)
 			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
 		}
 	}
+
+	/*
+	 * The value of slot->finished we set before the loop was used
+	 * to set our "finished" variable when our request completed.
+	 *
+	 * 1. The slot may not have been reused for another request
+	 *    yet, in which case it still has &finished.
+	 *
+	 * 2. The slot may already be in-use to serve another request,
+	 *    which can further be divided into two cases:
+	 *
+	 * (a) If run_active_slot() hasn't been called for that
+	 *     other request, slot->finished may still have the
+	 *     address of our &finished.
+	 *
+	 * (b) If the request did call run_active_slot(), then the
+	 *     call would have updated slot->finished at the beginning
+	 *     of this function, and with the clearing of the member
+	 *     below, we would find that slot->finished is now NULL.
+	 *
+	 * In all cases, slot->finished has no useful information to
+	 * anybody at this point.  Some compilers warn us for
+	 * attempting to smuggle a pointer that is about to become
+	 * invalid, i.e. &finished.  We clear it here to assure them.
+	 */
+	slot->finished = NULL;
 }
 
 static void release_active_slot(struct active_request_slot *slot)