diff mbox series

xen: Use -Wuninitialized and -Winit-self

Message ID 20231228193907.3052681-1-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Headers show
Series xen: Use -Wuninitialized and -Winit-self | expand

Commit Message

Andrew Cooper Dec. 28, 2023, 7:39 p.m. UTC
The use of uninitialised data is undefined behaviour.  At -O2 with trivial
examples, both Clang and GCC delete the variable, and in the case of a
function return, the caller gets whatever was stale in %rax prior to the call.

Clang includes -Wuninitialized within -Wall, but GCC only includes it in
-Wextra, which is not used by Xen at this time.

Furthermore, the specific pattern of assigning a variable to itself in its
declaration is only diagnosed by GCC with -Winit-self.  Clang does diagnoise
simple forms of this pattern with a plain -Wuninitialized, but it fails to
diagnose the instances in Xen that GCC manages to find.

GCC, with -Wuninitialized and -Winit-self notices:

  arch/x86/time.c: In function ‘read_pt_and_tsc’:
  arch/x86/time.c:297:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
    297 |     uint32_t best = best;
        |              ^~~~
  arch/x86/time.c: In function ‘read_pt_and_tmcct’:
  arch/x86/time.c:1022:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
   1022 |     uint64_t best = best;
        |              ^~~~

and both have logic paths where best can be returned while uninitalised.  In
both cases, initialise to ~0 like the associated *_min variable which also
gates updating best.

Fixes: 23658e823238 ("x86/time: further improve TSC / CPU freq calibration accuracy")
Fixes: 3f3906b462d5 ("x86/APIC: calibrate against platform timer when possible")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

Full Gitlab:
  https://gitlab.com/xen-project/people/andyhhp/xen/-/pipelines/1121715339
---
 xen/Makefile        | 3 ++-
 xen/arch/x86/time.c | 4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

Comments

Jan Beulich Jan. 4, 2024, 1:41 p.m. UTC | #1
On 28.12.2023 20:39, Andrew Cooper wrote:
> The use of uninitialised data is undefined behaviour.  At -O2 with trivial
> examples, both Clang and GCC delete the variable, and in the case of a
> function return, the caller gets whatever was stale in %rax prior to the call.
> 
> Clang includes -Wuninitialized within -Wall, but GCC only includes it in
> -Wextra, which is not used by Xen at this time.
> 
> Furthermore, the specific pattern of assigning a variable to itself in its
> declaration is only diagnosed by GCC with -Winit-self.  Clang does diagnoise
> simple forms of this pattern with a plain -Wuninitialized, but it fails to
> diagnose the instances in Xen that GCC manages to find.
> 
> GCC, with -Wuninitialized and -Winit-self notices:
> 
>   arch/x86/time.c: In function ‘read_pt_and_tsc’:
>   arch/x86/time.c:297:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
>     297 |     uint32_t best = best;
>         |              ^~~~
>   arch/x86/time.c: In function ‘read_pt_and_tmcct’:
>   arch/x86/time.c:1022:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
>    1022 |     uint64_t best = best;
>         |              ^~~~
> 
> and both have logic paths where best can be returned while uninitalised.

I disagree. In both cases the variables are reliably set during the first
loop iteration. Therefore I also disagree that there want to be Fixes: tags
here. There's one case where initialization could be bypassed, but that's a
purely theoretical case afaict.

Furthermore this initialize-to-self is a well known pattern to suppress the
-Wuninitialized induced warnings, originally used by Linux'es
uninitialized_var(). If we really want to use -Winit-self (and hence disallow
use of this pattern even in cases like the ones here, where they're used to
suppress false positive warnings), this should imo be done separately from
adding -Wuninitialized, and only after proper weighing of the pros and cons
(a wider Cc list would be required anyway for the xen/Makefile change).

>  In
> both cases, initialise to ~0 like the associated *_min variable which also
> gates updating best.

Considering the affected functions are both __init, this change isn't a big
problem. But if you were truly concerned of the one theoretical case, you
can't get away with this either: If the variables really remained unwritten,
by returning ~0 you'd end up confusing the caller.

Jan
Andrew Cooper Jan. 4, 2024, 2:33 p.m. UTC | #2
On 04/01/2024 1:41 pm, Jan Beulich wrote:
> On 28.12.2023 20:39, Andrew Cooper wrote:
>> The use of uninitialised data is undefined behaviour.  At -O2 with trivial
>> examples, both Clang and GCC delete the variable, and in the case of a
>> function return, the caller gets whatever was stale in %rax prior to the call.
>>
>> Clang includes -Wuninitialized within -Wall, but GCC only includes it in
>> -Wextra, which is not used by Xen at this time.
>>
>> Furthermore, the specific pattern of assigning a variable to itself in its
>> declaration is only diagnosed by GCC with -Winit-self.  Clang does diagnoise
>> simple forms of this pattern with a plain -Wuninitialized, but it fails to
>> diagnose the instances in Xen that GCC manages to find.
>>
>> GCC, with -Wuninitialized and -Winit-self notices:
>>
>>   arch/x86/time.c: In function ‘read_pt_and_tsc’:
>>   arch/x86/time.c:297:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
>>     297 |     uint32_t best = best;
>>         |              ^~~~
>>   arch/x86/time.c: In function ‘read_pt_and_tmcct’:
>>   arch/x86/time.c:1022:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
>>    1022 |     uint64_t best = best;
>>         |              ^~~~
>>
>> and both have logic paths where best can be returned while uninitalised.
> I disagree. In both cases the variables are reliably set during the first
> loop iteration.

I suggest you pay attention to the precision of the integers.

It is hard (likely prohibitively hard) to avoid entering the if(), but
it is not impossible.

The compiler really has emitted logic paths where stack rubble is returned.

> Furthermore this initialize-to-self is a well known pattern to suppress the
> -Wuninitialized induced warnings, originally used by Linux'es
> uninitialized_var().

I'm glad you cited this, because it proves my point.

Notice how it was purged from Linux slowly over the course of 8 years
because it had been shown to create real bugs, by hiding real uses of
uninitialised variables.

I'm honestly surprised that it hasn't come up yet in the MISRA work.

>  If we really want to use -Winit-self (and hence disallow
> use of this pattern even in cases like the ones here, where they're used to
> suppress false positive warnings), this should imo be done separately from
> adding -Wuninitialized, and only after proper weighing of the pros and cons
> (a wider Cc list would be required anyway for the xen/Makefile change).

There are exactly two uses of this antipattern in the entirety of Xen. 
They are both in x86 init code.

Do you honestly think trying to block a patch this clear and obvious is
going to be a good use of anyone's time.

>
>>  In
>> both cases, initialise to ~0 like the associated *_min variable which also
>> gates updating best.
> Considering the affected functions are both __init, this change isn't a big
> problem. But if you were truly concerned of the one theoretical case, you
> can't get away with this either: If the variables really remained unwritten,
> by returning ~0 you'd end up confusing the caller.

The fact this is a crap API design doesn't make it ok to use undefined
behaviour.

Getting ~0 back is strictly less bad than getting stack rubble because
at least it's obviously wrong.

~Andrew
Jan Beulich Jan. 4, 2024, 3:33 p.m. UTC | #3
On 04.01.2024 15:33, Andrew Cooper wrote:
> On 04/01/2024 1:41 pm, Jan Beulich wrote:
>> On 28.12.2023 20:39, Andrew Cooper wrote:
>>> The use of uninitialised data is undefined behaviour.  At -O2 with trivial
>>> examples, both Clang and GCC delete the variable, and in the case of a
>>> function return, the caller gets whatever was stale in %rax prior to the call.
>>>
>>> Clang includes -Wuninitialized within -Wall, but GCC only includes it in
>>> -Wextra, which is not used by Xen at this time.
>>>
>>> Furthermore, the specific pattern of assigning a variable to itself in its
>>> declaration is only diagnosed by GCC with -Winit-self.  Clang does diagnoise
>>> simple forms of this pattern with a plain -Wuninitialized, but it fails to
>>> diagnose the instances in Xen that GCC manages to find.
>>>
>>> GCC, with -Wuninitialized and -Winit-self notices:
>>>
>>>   arch/x86/time.c: In function ‘read_pt_and_tsc’:
>>>   arch/x86/time.c:297:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
>>>     297 |     uint32_t best = best;
>>>         |              ^~~~
>>>   arch/x86/time.c: In function ‘read_pt_and_tmcct’:
>>>   arch/x86/time.c:1022:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
>>>    1022 |     uint64_t best = best;
>>>         |              ^~~~
>>>
>>> and both have logic paths where best can be returned while uninitalised.
>> I disagree. In both cases the variables are reliably set during the first
>> loop iteration.
> 
> I suggest you pay attention to the precision of the integers.
> 
> It is hard (likely prohibitively hard) to avoid entering the if(), but
> it is not impossible.

Okay, let's go into the details then. For initialization to be skipped,
two successive rdtsc_ordered() (taking read_pt_and_tsc() as reference)
need to return values 2^^64-1 apart. How do you see that happening, when
(iirc) we've been in agreement more than once that TSC rollover isn't
possible with current or foreseeable hardware?

Things are naturally less unlikely in read_pt_and_tmcct(), for it being
32-bit quantities there. Otoh iirc the APIC timer ticks at rate quite a
bit lower than the TSC. So 2^^32-1 ticks are still a lot of time.

> The compiler really has emitted logic paths where stack rubble is returned.

Of course, since it can't make assumptions on realistic value ranges.

>> Furthermore this initialize-to-self is a well known pattern to suppress the
>> -Wuninitialized induced warnings, originally used by Linux'es
>> uninitialized_var().
> 
> I'm glad you cited this, because it proves my point.
> 
> Notice how it was purged from Linux slowly over the course of 8 years
> because it had been shown to create real bugs, by hiding real uses of
> uninitialised variables.

I'm fully aware of this. The construct was used wrongly in too many cases.
Still I recall times where I was actively asked to use the construct.

> I'm honestly surprised that it hasn't come up yet in the MISRA work.
> 
>>  If we really want to use -Winit-self (and hence disallow
>> use of this pattern even in cases like the ones here, where they're used to
>> suppress false positive warnings), this should imo be done separately from
>> adding -Wuninitialized, and only after proper weighing of the pros and cons
>> (a wider Cc list would be required anyway for the xen/Makefile change).
> 
> There are exactly two uses of this antipattern in the entirety of Xen. 
> They are both in x86 init code.

These two instances aren't all that old. If you deem them antipatterns (I
don't, albeit I see fair room for abuse), why did you not object (suggesting
whatever better alternative)?

> Do you honestly think trying to block a patch this clear and obvious is
> going to be a good use of anyone's time.

Well, you're dong two things at a time, both of which may be clear and
obvious to you. I agree for one half, but I have reservations with the
other. Hence asking that you at least involve all REST maintainers by
Cc-ing them on the patch submission isn't a waste of time, I don't think.
And note, I'm not saying "no" to that second part of the change, but I
do see downsides alongside the upsides you (and I) see.

>>>  In
>>> both cases, initialise to ~0 like the associated *_min variable which also
>>> gates updating best.
>> Considering the affected functions are both __init, this change isn't a big
>> problem. But if you were truly concerned of the one theoretical case, you
>> can't get away with this either: If the variables really remained unwritten,
>> by returning ~0 you'd end up confusing the caller.
> 
> The fact this is a crap API design doesn't make it ok to use undefined
> behaviour.

Thank you for wording it that way.

> Getting ~0 back is strictly less bad than getting stack rubble because
> at least it's obviously wrong.

But then why not change things so there's no issue anymore? Plus I'm not
sure how / whether "obviously wrong" would manifest. I expect it would
be an entirely unobvious boot hang, or other misbehavior.

Jan
Roberto Bagnara Jan. 4, 2024, 8:43 p.m. UTC | #4
On 2024-01-04 15:33, Andrew Cooper wrote:
> On 04/01/2024 1:41 pm, Jan Beulich wrote:
>> On 28.12.2023 20:39, Andrew Cooper wrote:
>>> The use of uninitialised data is undefined behaviour.  At -O2 with trivial
>>> examples, both Clang and GCC delete the variable, and in the case of a
>>> function return, the caller gets whatever was stale in %rax prior to the call.
>>>
>>> Clang includes -Wuninitialized within -Wall, but GCC only includes it in
>>> -Wextra, which is not used by Xen at this time.
>>>
>>> Furthermore, the specific pattern of assigning a variable to itself in its
>>> declaration is only diagnosed by GCC with -Winit-self.  Clang does diagnoise
>>> simple forms of this pattern with a plain -Wuninitialized, but it fails to
>>> diagnose the instances in Xen that GCC manages to find.
>>>
>>> GCC, with -Wuninitialized and -Winit-self notices:
>>>
>>>    arch/x86/time.c: In function ‘read_pt_and_tsc’:
>>>    arch/x86/time.c:297:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
>>>      297 |     uint32_t best = best;
>>>          |              ^~~~
>>>    arch/x86/time.c: In function ‘read_pt_and_tmcct’:
>>>    arch/x86/time.c:1022:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
>>>     1022 |     uint64_t best = best;
>>>          |              ^~~~
>>>
>>> and both have logic paths where best can be returned while uninitalised.
>> I disagree. In both cases the variables are reliably set during the first
>> loop iteration.
> 
> I suggest you pay attention to the precision of the integers.
> 
> It is hard (likely prohibitively hard) to avoid entering the if(), but
> it is not impossible.
> 
> The compiler really has emitted logic paths where stack rubble is returned.
> 
>> Furthermore this initialize-to-self is a well known pattern to suppress the
>> -Wuninitialized induced warnings, originally used by Linux'es
>> uninitialized_var().
> 
> I'm glad you cited this, because it proves my point.
> 
> Notice how it was purged from Linux slowly over the course of 8 years
> because it had been shown to create real bugs, by hiding real uses of
> uninitialised variables.

There is a worse problem for initialize-to-self: it is undefined behavior
per se.  If this is done to suppress a warning, then what happens is
paradoxical: in order to suppress a warning about a potential undefined
behavior (the variable might indeed be always written before being read)
one introduces a definite undefined behavior.
Kind regards,

    Roberto
Tamas K Lengyel Jan. 4, 2024, 9:27 p.m. UTC | #5
> > Getting ~0 back is strictly less bad than getting stack rubble because
> > at least it's obviously wrong.
>
> But then why not change things so there's no issue anymore? Plus I'm not
> sure how / whether "obviously wrong" would manifest. I expect it would
> be an entirely unobvious boot hang, or other misbehavior.

+1 for changing these APIs to make it clear when an error happened
instead of returning magic value. Otherwise yea clearly should not use
init-to-self anywhere just to silence other warnings..

Tamas
Stefano Stabellini Jan. 5, 2024, 12:02 a.m. UTC | #6
On Thu, 4 Jan 2024, Roberto Bagnara wrote:
> On 2024-01-04 15:33, Andrew Cooper wrote:
> > On 04/01/2024 1:41 pm, Jan Beulich wrote:
> > > On 28.12.2023 20:39, Andrew Cooper wrote:
> > > > The use of uninitialised data is undefined behaviour.  At -O2 with
> > > > trivial
> > > > examples, both Clang and GCC delete the variable, and in the case of a
> > > > function return, the caller gets whatever was stale in %rax prior to the
> > > > call.
> > > > 
> > > > Clang includes -Wuninitialized within -Wall, but GCC only includes it in
> > > > -Wextra, which is not used by Xen at this time.
> > > > 
> > > > Furthermore, the specific pattern of assigning a variable to itself in
> > > > its
> > > > declaration is only diagnosed by GCC with -Winit-self.  Clang does
> > > > diagnoise
> > > > simple forms of this pattern with a plain -Wuninitialized, but it fails
> > > > to
> > > > diagnose the instances in Xen that GCC manages to find.
> > > > 
> > > > GCC, with -Wuninitialized and -Winit-self notices:
> > > > 
> > > >    arch/x86/time.c: In function ‘read_pt_and_tsc’:
> > > >    arch/x86/time.c:297:14: error: ‘best’ is used uninitialized in this
> > > > function [-Werror=uninitialized]
> > > >      297 |     uint32_t best = best;
> > > >          |              ^~~~
> > > >    arch/x86/time.c: In function ‘read_pt_and_tmcct’:
> > > >    arch/x86/time.c:1022:14: error: ‘best’ is used uninitialized in this
> > > > function [-Werror=uninitialized]
> > > >     1022 |     uint64_t best = best;
> > > >          |              ^~~~
> > > > 
> > > > and both have logic paths where best can be returned while uninitalised.
> > > I disagree. In both cases the variables are reliably set during the first
> > > loop iteration.
> > 
> > I suggest you pay attention to the precision of the integers.
> > 
> > It is hard (likely prohibitively hard) to avoid entering the if(), but
> > it is not impossible.
> > 
> > The compiler really has emitted logic paths where stack rubble is returned.
> > 
> > > Furthermore this initialize-to-self is a well known pattern to suppress
> > > the
> > > -Wuninitialized induced warnings, originally used by Linux'es
> > > uninitialized_var().
> > 
> > I'm glad you cited this, because it proves my point.
> > 
> > Notice how it was purged from Linux slowly over the course of 8 years
> > because it had been shown to create real bugs, by hiding real uses of
> > uninitialised variables.
> 
> There is a worse problem for initialize-to-self: it is undefined behavior
> per se.  If this is done to suppress a warning, then what happens is
> paradoxical: in order to suppress a warning about a potential undefined
> behavior (the variable might indeed be always written before being read)
> one introduces a definite undefined behavior.

Thanks for the insight, Roberto. I think best = best is the worst option
because it tries to suppress an uninitialized warning by introducing an
undefined behavior.

I think anything else is better, including:
- best = ~0;
- best = 0;
- some sort of uninitialized_var implementation without init-to-self

I am in favor of adding -Winit-self to the build. That's a good idea. I
don't have an opinion on whether it should be done as part of this patch
or separately. I also don't have an opinion on whether the Fixes tags
are appropriate. I would be happy with or without them.

So, I would ack this patch.

I see that updating the function to return a proper error would be good
but I wouldn't scope-creep an otherwise simple improvement, so I
wouldn't ask the contributor to do it necessarily as part of this patch.
Jan Beulich Jan. 5, 2024, 6:56 a.m. UTC | #7
On 04.01.2024 21:43, Roberto Bagnara wrote:
> On 2024-01-04 15:33, Andrew Cooper wrote:
>> On 04/01/2024 1:41 pm, Jan Beulich wrote:
>>> On 28.12.2023 20:39, Andrew Cooper wrote:
>>>> The use of uninitialised data is undefined behaviour.  At -O2 with trivial
>>>> examples, both Clang and GCC delete the variable, and in the case of a
>>>> function return, the caller gets whatever was stale in %rax prior to the call.
>>>>
>>>> Clang includes -Wuninitialized within -Wall, but GCC only includes it in
>>>> -Wextra, which is not used by Xen at this time.
>>>>
>>>> Furthermore, the specific pattern of assigning a variable to itself in its
>>>> declaration is only diagnosed by GCC with -Winit-self.  Clang does diagnoise
>>>> simple forms of this pattern with a plain -Wuninitialized, but it fails to
>>>> diagnose the instances in Xen that GCC manages to find.
>>>>
>>>> GCC, with -Wuninitialized and -Winit-self notices:
>>>>
>>>>    arch/x86/time.c: In function ‘read_pt_and_tsc’:
>>>>    arch/x86/time.c:297:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
>>>>      297 |     uint32_t best = best;
>>>>          |              ^~~~
>>>>    arch/x86/time.c: In function ‘read_pt_and_tmcct’:
>>>>    arch/x86/time.c:1022:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
>>>>     1022 |     uint64_t best = best;
>>>>          |              ^~~~
>>>>
>>>> and both have logic paths where best can be returned while uninitalised.
>>> I disagree. In both cases the variables are reliably set during the first
>>> loop iteration.
>>
>> I suggest you pay attention to the precision of the integers.
>>
>> It is hard (likely prohibitively hard) to avoid entering the if(), but
>> it is not impossible.
>>
>> The compiler really has emitted logic paths where stack rubble is returned.
>>
>>> Furthermore this initialize-to-self is a well known pattern to suppress the
>>> -Wuninitialized induced warnings, originally used by Linux'es
>>> uninitialized_var().
>>
>> I'm glad you cited this, because it proves my point.
>>
>> Notice how it was purged from Linux slowly over the course of 8 years
>> because it had been shown to create real bugs, by hiding real uses of
>> uninitialised variables.
> 
> There is a worse problem for initialize-to-self: it is undefined behavior
> per se.  If this is done to suppress a warning, then what happens is
> paradoxical: in order to suppress a warning about a potential undefined
> behavior (the variable might indeed be always written before being read)
> one introduces a definite undefined behavior.

I don't think so - aiui this is another of the many gcc extensions to the
language (no code is generated for this type of initialization, iirc).

Jan
Jan Beulich Jan. 5, 2024, 7 a.m. UTC | #8
On 04.01.2024 16:33, Jan Beulich wrote:
> On 04.01.2024 15:33, Andrew Cooper wrote:
>> On 04/01/2024 1:41 pm, Jan Beulich wrote:
>>> Furthermore this initialize-to-self is a well known pattern to suppress the
>>> -Wuninitialized induced warnings, originally used by Linux'es
>>> uninitialized_var().
>>
>> I'm glad you cited this, because it proves my point.
>>
>> Notice how it was purged from Linux slowly over the course of 8 years
>> because it had been shown to create real bugs, by hiding real uses of
>> uninitialised variables.
> 
> I'm fully aware of this. The construct was used wrongly in too many cases.
> Still I recall times where I was actively asked to use the construct.

And, btw: The construct was (meant to be) used in particular for suppressing
false-positive -Wuninitialized warnings on, in particular, older gcc. Which
means that by enabling this option we may also subsequently need to deal with
fallout. I'm not convinced we want to uniformly do so by adding "normal"
initializers wherever (not really) needed.

Jan
Roberto Bagnara Jan. 5, 2024, 12:25 p.m. UTC | #9
On 2024-01-05 07:56, Jan Beulich wrote:
> On 04.01.2024 21:43, Roberto Bagnara wrote:
>> On 2024-01-04 15:33, Andrew Cooper wrote:
>>> On 04/01/2024 1:41 pm, Jan Beulich wrote:
>>>> On 28.12.2023 20:39, Andrew Cooper wrote:
>>>>> The use of uninitialised data is undefined behaviour.  At -O2 with trivial
>>>>> examples, both Clang and GCC delete the variable, and in the case of a
>>>>> function return, the caller gets whatever was stale in %rax prior to the call.
>>>>>
>>>>> Clang includes -Wuninitialized within -Wall, but GCC only includes it in
>>>>> -Wextra, which is not used by Xen at this time.
>>>>>
>>>>> Furthermore, the specific pattern of assigning a variable to itself in its
>>>>> declaration is only diagnosed by GCC with -Winit-self.  Clang does diagnoise
>>>>> simple forms of this pattern with a plain -Wuninitialized, but it fails to
>>>>> diagnose the instances in Xen that GCC manages to find.
>>>>>
>>>>> GCC, with -Wuninitialized and -Winit-self notices:
>>>>>
>>>>>     arch/x86/time.c: In function ‘read_pt_and_tsc’:
>>>>>     arch/x86/time.c:297:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
>>>>>       297 |     uint32_t best = best;
>>>>>           |              ^~~~
>>>>>     arch/x86/time.c: In function ‘read_pt_and_tmcct’:
>>>>>     arch/x86/time.c:1022:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
>>>>>      1022 |     uint64_t best = best;
>>>>>           |              ^~~~
>>>>>
>>>>> and both have logic paths where best can be returned while uninitalised.
>>>> I disagree. In both cases the variables are reliably set during the first
>>>> loop iteration.
>>>
>>> I suggest you pay attention to the precision of the integers.
>>>
>>> It is hard (likely prohibitively hard) to avoid entering the if(), but
>>> it is not impossible.
>>>
>>> The compiler really has emitted logic paths where stack rubble is returned.
>>>
>>>> Furthermore this initialize-to-self is a well known pattern to suppress the
>>>> -Wuninitialized induced warnings, originally used by Linux'es
>>>> uninitialized_var().
>>>
>>> I'm glad you cited this, because it proves my point.
>>>
>>> Notice how it was purged from Linux slowly over the course of 8 years
>>> because it had been shown to create real bugs, by hiding real uses of
>>> uninitialised variables.
>>
>> There is a worse problem for initialize-to-self: it is undefined behavior
>> per se.  If this is done to suppress a warning, then what happens is
>> paradoxical: in order to suppress a warning about a potential undefined
>> behavior (the variable might indeed be always written before being read)
>> one introduces a definite undefined behavior.
> 
> I don't think so - aiui this is another of the many gcc extensions to the
> language (no code is generated for this type of initialization, iirc).

Well, it is undefined behavior for the C Standard: "undefined behavior"
is a technical term (see, e.g., C18 3.4.3p1) that is not dependent on
any compiler or any implementation technique for the C language.
Any implementation can deal with undefined behavior with absolute
freedom, which includes, among infinite other possibilities, doing what
you believe it does (iyrc ;-)

In addition, the compiler is typically not the only element of the toolchain
that needs to reliably interpret and process C code: so, even if we could reach
absolute certainty about what a specific version of one compiler does in response
to one instance of undefined behavior (and I think we cannot), then we would
need to obtain similar guarantees for other tools.
Stefano Stabellini April 25, 2024, 11:01 p.m. UTC | #10
On Thu, 28 Dec 2023, Andrew Cooper wrote:
> The use of uninitialised data is undefined behaviour.  At -O2 with trivial
> examples, both Clang and GCC delete the variable, and in the case of a
> function return, the caller gets whatever was stale in %rax prior to the call.
> 
> Clang includes -Wuninitialized within -Wall, but GCC only includes it in
> -Wextra, which is not used by Xen at this time.
> 
> Furthermore, the specific pattern of assigning a variable to itself in its
> declaration is only diagnosed by GCC with -Winit-self.  Clang does diagnoise
> simple forms of this pattern with a plain -Wuninitialized, but it fails to
> diagnose the instances in Xen that GCC manages to find.
> 
> GCC, with -Wuninitialized and -Winit-self notices:
> 
>   arch/x86/time.c: In function ‘read_pt_and_tsc’:
>   arch/x86/time.c:297:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
>     297 |     uint32_t best = best;
>         |              ^~~~
>   arch/x86/time.c: In function ‘read_pt_and_tmcct’:
>   arch/x86/time.c:1022:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
>    1022 |     uint64_t best = best;
>         |              ^~~~
> 
> and both have logic paths where best can be returned while uninitalised.  In
> both cases, initialise to ~0 like the associated *_min variable which also
> gates updating best.
> 
> Fixes: 23658e823238 ("x86/time: further improve TSC / CPU freq calibration accuracy")
> Fixes: 3f3906b462d5 ("x86/APIC: calibrate against platform timer when possible")
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
diff mbox series

Patch

diff --git a/xen/Makefile b/xen/Makefile
index 21832d640225..81861bc41867 100644
--- a/xen/Makefile
+++ b/xen/Makefile
@@ -393,8 +393,9 @@  CFLAGS-$(CONFIG_CC_SPLIT_SECTIONS) += -ffunction-sections -fdata-sections
 
 CFLAGS += -nostdinc -fno-builtin -fno-common
 CFLAGS += -Werror -Wredundant-decls -Wwrite-strings -Wno-pointer-arith
-CFLAGS += -Wdeclaration-after-statement
+CFLAGS += -Wdeclaration-after-statement -Wuninitialized
 $(call cc-option-add,CFLAGS,CC,-Wvla)
+$(call cc-option-add,CFLAGS,CC,-Winit-self)
 CFLAGS += -pipe -D__XEN__ -include $(srctree)/include/xen/config.h
 CFLAGS-$(CONFIG_DEBUG_INFO) += -g
 
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 6d33edd0addc..3fd90e9b78bc 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -294,7 +294,7 @@  static uint32_t __init read_pt_and_tsc(uint64_t *tsc,
                                        const struct platform_timesource *pts)
 {
     uint64_t tsc_prev = *tsc = rdtsc_ordered(), tsc_min = ~0;
-    uint32_t best = best;
+    uint32_t best = ~0;
     unsigned int i;
 
     for ( i = 0; ; ++i )
@@ -1019,7 +1019,7 @@  static u64 __init init_platform_timer(void)
 static uint64_t __init read_pt_and_tmcct(uint32_t *tmcct)
 {
     uint32_t tmcct_prev = *tmcct = apic_tmcct_read(), tmcct_min = ~0;
-    uint64_t best = best;
+    uint64_t best = ~0;
     unsigned int i;
 
     for ( i = 0; ; ++i )