diff mbox series

[XEN,v2,05/13] x86/traps: address violations of MISRA C Rule 16.3

Message ID 4f44a7b021eb4f78ccf1ce69b500b48b75df81c5.1719218291.git.federico.serafini@bugseng.com (mailing list archive)
State Superseded
Headers show
Series x86: address some violations of MISRA C Rule 16.3 | expand

Commit Message

Federico Serafini June 24, 2024, 9:04 a.m. UTC
Add break or pseudo keyword fallthrough to address violations of
MISRA C Rule 16.3: "An unconditional `break' statement shall terminate
every switch-clause".

No functional change.

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
---
 xen/arch/x86/traps.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Jan Beulich June 24, 2024, 3:19 p.m. UTC | #1
On 24.06.2024 11:04, Federico Serafini wrote:
> Add break or pseudo keyword fallthrough to address violations of
> MISRA C Rule 16.3: "An unconditional `break' statement shall terminate
> every switch-clause".
> 
> No functional change.
> 
> Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>

Technically the change fulfills its purpose, yet:

> @@ -1748,6 +1749,7 @@ static void io_check_error(const struct cpu_user_regs *regs)
>      {
>      case 'd': /* 'dom0' */
>          nmi_hwdom_report(_XEN_NMIREASON_io_error);
> +        fallthrough;
>      case 'i': /* 'ignore' */
>          break;
>      default:  /* 'fatal' */
> @@ -1768,6 +1770,7 @@ static void unknown_nmi_error(const struct cpu_user_regs *regs,
>      {
>      case 'd': /* 'dom0' */
>          nmi_hwdom_report(_XEN_NMIREASON_unknown);
> +        fallthrough;
>      case 'i': /* 'ignore' */
>          break;
>      default:  /* 'fatal' */

Falling through isn't really useful here. In such a case I think it would
be preferable to avoid the pseudo-keyword and use the shorter "break".

Jan
Stefano Stabellini June 25, 2024, 12:54 a.m. UTC | #2
On Mon, 24 Jun 2024, Federico Serafini wrote:
> Add break or pseudo keyword fallthrough to address violations of
> MISRA C Rule 16.3: "An unconditional `break' statement shall terminate
> every switch-clause".
> 
> No functional change.
> 
> Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
> ---
>  xen/arch/x86/traps.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 9906e874d5..cbcec3fafb 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1186,6 +1186,7 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
>  
>      default:
>          ASSERT_UNREACHABLE();
> +        break;

Please add ASSERT_UNREACHABLE to the list of "unconditional flow control
statements" that can terminate a case, in addition to break.


>      }
>  }
>  
> @@ -1748,6 +1749,7 @@ static void io_check_error(const struct cpu_user_regs *regs)
>      {
>      case 'd': /* 'dom0' */
>          nmi_hwdom_report(_XEN_NMIREASON_io_error);
> +        fallthrough;
>      case 'i': /* 'ignore' */
>          break;
>      default:  /* 'fatal' */
> @@ -1768,6 +1770,7 @@ static void unknown_nmi_error(const struct cpu_user_regs *regs,
>      {
>      case 'd': /* 'dom0' */
>          nmi_hwdom_report(_XEN_NMIREASON_unknown);
> +        fallthrough;
>      case 'i': /* 'ignore' */
>          break;
>      default:  /* 'fatal' */

These two are nice improvements
Jan Beulich June 25, 2024, 6:41 a.m. UTC | #3
On 25.06.2024 02:54, Stefano Stabellini wrote:
> On Mon, 24 Jun 2024, Federico Serafini wrote:
>> Add break or pseudo keyword fallthrough to address violations of
>> MISRA C Rule 16.3: "An unconditional `break' statement shall terminate
>> every switch-clause".
>>
>> No functional change.
>>
>> Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
>> ---
>>  xen/arch/x86/traps.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
>> index 9906e874d5..cbcec3fafb 100644
>> --- a/xen/arch/x86/traps.c
>> +++ b/xen/arch/x86/traps.c
>> @@ -1186,6 +1186,7 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
>>  
>>      default:
>>          ASSERT_UNREACHABLE();
>> +        break;
> 
> Please add ASSERT_UNREACHABLE to the list of "unconditional flow control
> statements" that can terminate a case, in addition to break.

Why? Exactly the opposite is part of the subject of a recent patch, iirc.
Simply because of the rules needing to cover both debug and release builds.

Jan
Stefano Stabellini June 26, 2024, 1:11 a.m. UTC | #4
On Tue, 25 Jun 2024, Jan Beulich wrote:
> On 25.06.2024 02:54, Stefano Stabellini wrote:
> > On Mon, 24 Jun 2024, Federico Serafini wrote:
> >> Add break or pseudo keyword fallthrough to address violations of
> >> MISRA C Rule 16.3: "An unconditional `break' statement shall terminate
> >> every switch-clause".
> >>
> >> No functional change.
> >>
> >> Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
> >> ---
> >>  xen/arch/x86/traps.c | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> >> index 9906e874d5..cbcec3fafb 100644
> >> --- a/xen/arch/x86/traps.c
> >> +++ b/xen/arch/x86/traps.c
> >> @@ -1186,6 +1186,7 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
> >>  
> >>      default:
> >>          ASSERT_UNREACHABLE();
> >> +        break;
> > 
> > Please add ASSERT_UNREACHABLE to the list of "unconditional flow control
> > statements" that can terminate a case, in addition to break.
> 
> Why? Exactly the opposite is part of the subject of a recent patch, iirc.
> Simply because of the rules needing to cover both debug and release builds.

The reason is that ASSERT_UNREACHABLE() might disappear from the release
build but it can still be used as a marker during static analysis. In
my view, ASSERT_UNREACHABLE() is equivalent to a noreturn function call
which has an empty implementation in release builds.

The only reason I can think of to require a break; after an
ASSERT_UNREACHABLE() would be if we think the unreachability only apply
to debug build, not release build:

- debug build: it is unreachable
- release build: it is reachable

I don't think that is meant to be possible so I think we can use
ASSERT_UNREACHABLE() as a marker.
Jan Beulich June 26, 2024, 8:11 a.m. UTC | #5
On 26.06.2024 03:11, Stefano Stabellini wrote:
> On Tue, 25 Jun 2024, Jan Beulich wrote:
>> On 25.06.2024 02:54, Stefano Stabellini wrote:
>>> On Mon, 24 Jun 2024, Federico Serafini wrote:
>>>> Add break or pseudo keyword fallthrough to address violations of
>>>> MISRA C Rule 16.3: "An unconditional `break' statement shall terminate
>>>> every switch-clause".
>>>>
>>>> No functional change.
>>>>
>>>> Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
>>>> ---
>>>>  xen/arch/x86/traps.c | 3 +++
>>>>  1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
>>>> index 9906e874d5..cbcec3fafb 100644
>>>> --- a/xen/arch/x86/traps.c
>>>> +++ b/xen/arch/x86/traps.c
>>>> @@ -1186,6 +1186,7 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
>>>>  
>>>>      default:
>>>>          ASSERT_UNREACHABLE();
>>>> +        break;
>>>
>>> Please add ASSERT_UNREACHABLE to the list of "unconditional flow control
>>> statements" that can terminate a case, in addition to break.
>>
>> Why? Exactly the opposite is part of the subject of a recent patch, iirc.
>> Simply because of the rules needing to cover both debug and release builds.
> 
> The reason is that ASSERT_UNREACHABLE() might disappear from the release
> build but it can still be used as a marker during static analysis. In
> my view, ASSERT_UNREACHABLE() is equivalent to a noreturn function call
> which has an empty implementation in release builds.
> 
> The only reason I can think of to require a break; after an
> ASSERT_UNREACHABLE() would be if we think the unreachability only apply
> to debug build, not release build:
> 
> - debug build: it is unreachable
> - release build: it is reachable
> 
> I don't think that is meant to be possible so I think we can use
> ASSERT_UNREACHABLE() as a marker.

Well. For one such an assumption takes as a prereq that a debug build will
be run through full coverage testing, i.e. all reachable paths proven to
be taken. I understand that this prereq is intended to somehow be met,
even if I'm having difficulty seeing how such a final proof would look
like: Full coverage would, to me, mean that _every_ line is reachable. Yet
clearly any ASSERT_UNREACHABLE() must never be reached.

And then not covering for such cases takes the further assumption that
debug and release builds are functionally identical. I'm afraid this would
be a wrong assumption to make:
1) We may screw up somewhere, with code wrongly enabled only in one of the
   two build modes.
2) The compiler may screw up, in particular with optimization.

Jan
Stefano Stabellini June 27, 2024, 1:53 a.m. UTC | #6
On Wed, 26 Jun 2024, Jan Beulich wrote:
> On 26.06.2024 03:11, Stefano Stabellini wrote:
> > On Tue, 25 Jun 2024, Jan Beulich wrote:
> >> On 25.06.2024 02:54, Stefano Stabellini wrote:
> >>> On Mon, 24 Jun 2024, Federico Serafini wrote:
> >>>> Add break or pseudo keyword fallthrough to address violations of
> >>>> MISRA C Rule 16.3: "An unconditional `break' statement shall terminate
> >>>> every switch-clause".
> >>>>
> >>>> No functional change.
> >>>>
> >>>> Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
> >>>> ---
> >>>>  xen/arch/x86/traps.c | 3 +++
> >>>>  1 file changed, 3 insertions(+)
> >>>>
> >>>> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> >>>> index 9906e874d5..cbcec3fafb 100644
> >>>> --- a/xen/arch/x86/traps.c
> >>>> +++ b/xen/arch/x86/traps.c
> >>>> @@ -1186,6 +1186,7 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
> >>>>  
> >>>>      default:
> >>>>          ASSERT_UNREACHABLE();
> >>>> +        break;
> >>>
> >>> Please add ASSERT_UNREACHABLE to the list of "unconditional flow control
> >>> statements" that can terminate a case, in addition to break.
> >>
> >> Why? Exactly the opposite is part of the subject of a recent patch, iirc.
> >> Simply because of the rules needing to cover both debug and release builds.
> > 
> > The reason is that ASSERT_UNREACHABLE() might disappear from the release
> > build but it can still be used as a marker during static analysis. In
> > my view, ASSERT_UNREACHABLE() is equivalent to a noreturn function call
> > which has an empty implementation in release builds.
> > 
> > The only reason I can think of to require a break; after an
> > ASSERT_UNREACHABLE() would be if we think the unreachability only apply
> > to debug build, not release build:
> > 
> > - debug build: it is unreachable
> > - release build: it is reachable
> > 
> > I don't think that is meant to be possible so I think we can use
> > ASSERT_UNREACHABLE() as a marker.
> 
> Well. For one such an assumption takes as a prereq that a debug build will
> be run through full coverage testing, i.e. all reachable paths proven to
> be taken. I understand that this prereq is intended to somehow be met,
> even if I'm having difficulty seeing how such a final proof would look
> like: Full coverage would, to me, mean that _every_ line is reachable. Yet
> clearly any ASSERT_UNREACHABLE() must never be reached.
> 
> And then not covering for such cases takes the further assumption that
> debug and release builds are functionally identical. I'm afraid this would
> be a wrong assumption to make:
> 1) We may screw up somewhere, with code wrongly enabled only in one of the
>    two build modes.
> 2) The compiler may screw up, in particular with optimization.

I think there are two different issues here we are discussing.

One issue, like you said, has to do with coverage. It is important to
mark as "unreachable" any part of the code that is indeed unreachable
so that we can account it properly when we do coverage analysis. At the
moment the only "unreachable" marker that we have is
ASSERT_UNREACHABLE(), and I am hoping we can use it as part of the
coverage analysis we'll do.

However, there is a different separate question about what to do in the
Xen code after an ASSERT_UNREACHABLE(). E.g.:

             default:
                 ASSERT_UNREACHABLE();
                 return -EPERM; /* is it better with or without this? */
             }

Leaving coverage aside, would it be better to be defensive and actually
attempt to report errors back after an ASSERT_UNREACHABLE() like in the
example? Or is it better to assume the code is actually unreachable
hence there is no need to do anything afterwards?

One one hand, being defensive sounds good, on the other hand, any code
we add after ASSERT_UNREACHABLE() is dead code which cannot be tested,
which is also not good. In this example, there is no way to test the
return -EPERM code path. We also need to consider what is the right
thing to do if Xen finds itself in an erroneous situation such as being
in an unreachable code location.

So, after thinking about it and also talking to the safety manager, I
think we should:
- implement ASSERT_UNREACHABLE with a warning in release builds
- have "return -EPERM;" or similar for defensive programming
Stefano Stabellini June 27, 2024, 1:57 a.m. UTC | #7
On Wed, 26 Jun 2024, Stefano Stabellini wrote:
> So, after thinking about it and also talking to the safety manager, I
> think we should:
> - implement ASSERT_UNREACHABLE with a warning in release builds
> - have "return -EPERM;" or similar for defensive programming

Federico, as Jan agrees already on the second point, then I withdraw all
my comments about code after ASSERT_UNREACHABLE (you can consider your
R16.3 patches with my acks fully acked).
Jan Beulich June 27, 2024, 8:11 a.m. UTC | #8
On 27.06.2024 03:53, Stefano Stabellini wrote:
> On Wed, 26 Jun 2024, Jan Beulich wrote:
>> On 26.06.2024 03:11, Stefano Stabellini wrote:
>>> On Tue, 25 Jun 2024, Jan Beulich wrote:
>>>> On 25.06.2024 02:54, Stefano Stabellini wrote:
>>>>> On Mon, 24 Jun 2024, Federico Serafini wrote:
>>>>>> Add break or pseudo keyword fallthrough to address violations of
>>>>>> MISRA C Rule 16.3: "An unconditional `break' statement shall terminate
>>>>>> every switch-clause".
>>>>>>
>>>>>> No functional change.
>>>>>>
>>>>>> Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
>>>>>> ---
>>>>>>  xen/arch/x86/traps.c | 3 +++
>>>>>>  1 file changed, 3 insertions(+)
>>>>>>
>>>>>> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
>>>>>> index 9906e874d5..cbcec3fafb 100644
>>>>>> --- a/xen/arch/x86/traps.c
>>>>>> +++ b/xen/arch/x86/traps.c
>>>>>> @@ -1186,6 +1186,7 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
>>>>>>  
>>>>>>      default:
>>>>>>          ASSERT_UNREACHABLE();
>>>>>> +        break;
>>>>>
>>>>> Please add ASSERT_UNREACHABLE to the list of "unconditional flow control
>>>>> statements" that can terminate a case, in addition to break.
>>>>
>>>> Why? Exactly the opposite is part of the subject of a recent patch, iirc.
>>>> Simply because of the rules needing to cover both debug and release builds.
>>>
>>> The reason is that ASSERT_UNREACHABLE() might disappear from the release
>>> build but it can still be used as a marker during static analysis. In
>>> my view, ASSERT_UNREACHABLE() is equivalent to a noreturn function call
>>> which has an empty implementation in release builds.
>>>
>>> The only reason I can think of to require a break; after an
>>> ASSERT_UNREACHABLE() would be if we think the unreachability only apply
>>> to debug build, not release build:
>>>
>>> - debug build: it is unreachable
>>> - release build: it is reachable
>>>
>>> I don't think that is meant to be possible so I think we can use
>>> ASSERT_UNREACHABLE() as a marker.
>>
>> Well. For one such an assumption takes as a prereq that a debug build will
>> be run through full coverage testing, i.e. all reachable paths proven to
>> be taken. I understand that this prereq is intended to somehow be met,
>> even if I'm having difficulty seeing how such a final proof would look
>> like: Full coverage would, to me, mean that _every_ line is reachable. Yet
>> clearly any ASSERT_UNREACHABLE() must never be reached.
>>
>> And then not covering for such cases takes the further assumption that
>> debug and release builds are functionally identical. I'm afraid this would
>> be a wrong assumption to make:
>> 1) We may screw up somewhere, with code wrongly enabled only in one of the
>>    two build modes.
>> 2) The compiler may screw up, in particular with optimization.
> 
> I think there are two different issues here we are discussing.
> 
> One issue, like you said, has to do with coverage. It is important to
> mark as "unreachable" any part of the code that is indeed unreachable
> so that we can account it properly when we do coverage analysis. At the
> moment the only "unreachable" marker that we have is
> ASSERT_UNREACHABLE(), and I am hoping we can use it as part of the
> coverage analysis we'll do.
> 
> However, there is a different separate question about what to do in the
> Xen code after an ASSERT_UNREACHABLE(). E.g.:
> 
>              default:
>                  ASSERT_UNREACHABLE();
>                  return -EPERM; /* is it better with or without this? */
>              }
> 
> Leaving coverage aside, would it be better to be defensive and actually
> attempt to report errors back after an ASSERT_UNREACHABLE() like in the
> example? Or is it better to assume the code is actually unreachable
> hence there is no need to do anything afterwards?
> 
> One one hand, being defensive sounds good, on the other hand, any code
> we add after ASSERT_UNREACHABLE() is dead code which cannot be tested,
> which is also not good. In this example, there is no way to test the
> return -EPERM code path. We also need to consider what is the right
> thing to do if Xen finds itself in an erroneous situation such as being
> in an unreachable code location.
> 
> So, after thinking about it and also talking to the safety manager, I
> think we should:
> - implement ASSERT_UNREACHABLE with a warning in release builds

If at all, then controlled by a default-off Kconfig setting. This would,
after all, raise the question of how ASSERT() should behave then. Imo
the two should be consistent in this regard, and NDEBUG clearly results
in the expectation that ASSERT() expands to nothing. Perhaps this is
finally the time where we need to separate NDEBUG from CONFIG_DEBUG; we
did discuss doing so before. Then in your release builds, you could
actually leave assertions active.

> - have "return -EPERM;" or similar for defensive programming

You don't say how you'd deal with the not-reachable aspect then.

Jan
Stefano Stabellini June 27, 2024, 10:59 p.m. UTC | #9
On Thu, 27 Jun 2024, Jan Beulich wrote:
> On 27.06.2024 03:53, Stefano Stabellini wrote:
> > On Wed, 26 Jun 2024, Jan Beulich wrote:
> >> On 26.06.2024 03:11, Stefano Stabellini wrote:
> >>> On Tue, 25 Jun 2024, Jan Beulich wrote:
> >>>> On 25.06.2024 02:54, Stefano Stabellini wrote:
> >>>>> On Mon, 24 Jun 2024, Federico Serafini wrote:
> >>>>>> Add break or pseudo keyword fallthrough to address violations of
> >>>>>> MISRA C Rule 16.3: "An unconditional `break' statement shall terminate
> >>>>>> every switch-clause".
> >>>>>>
> >>>>>> No functional change.
> >>>>>>
> >>>>>> Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
> >>>>>> ---
> >>>>>>  xen/arch/x86/traps.c | 3 +++
> >>>>>>  1 file changed, 3 insertions(+)
> >>>>>>
> >>>>>> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> >>>>>> index 9906e874d5..cbcec3fafb 100644
> >>>>>> --- a/xen/arch/x86/traps.c
> >>>>>> +++ b/xen/arch/x86/traps.c
> >>>>>> @@ -1186,6 +1186,7 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
> >>>>>>  
> >>>>>>      default:
> >>>>>>          ASSERT_UNREACHABLE();
> >>>>>> +        break;
> >>>>>
> >>>>> Please add ASSERT_UNREACHABLE to the list of "unconditional flow control
> >>>>> statements" that can terminate a case, in addition to break.
> >>>>
> >>>> Why? Exactly the opposite is part of the subject of a recent patch, iirc.
> >>>> Simply because of the rules needing to cover both debug and release builds.
> >>>
> >>> The reason is that ASSERT_UNREACHABLE() might disappear from the release
> >>> build but it can still be used as a marker during static analysis. In
> >>> my view, ASSERT_UNREACHABLE() is equivalent to a noreturn function call
> >>> which has an empty implementation in release builds.
> >>>
> >>> The only reason I can think of to require a break; after an
> >>> ASSERT_UNREACHABLE() would be if we think the unreachability only apply
> >>> to debug build, not release build:
> >>>
> >>> - debug build: it is unreachable
> >>> - release build: it is reachable
> >>>
> >>> I don't think that is meant to be possible so I think we can use
> >>> ASSERT_UNREACHABLE() as a marker.
> >>
> >> Well. For one such an assumption takes as a prereq that a debug build will
> >> be run through full coverage testing, i.e. all reachable paths proven to
> >> be taken. I understand that this prereq is intended to somehow be met,
> >> even if I'm having difficulty seeing how such a final proof would look
> >> like: Full coverage would, to me, mean that _every_ line is reachable. Yet
> >> clearly any ASSERT_UNREACHABLE() must never be reached.
> >>
> >> And then not covering for such cases takes the further assumption that
> >> debug and release builds are functionally identical. I'm afraid this would
> >> be a wrong assumption to make:
> >> 1) We may screw up somewhere, with code wrongly enabled only in one of the
> >>    two build modes.
> >> 2) The compiler may screw up, in particular with optimization.
> > 
> > I think there are two different issues here we are discussing.
> > 
> > One issue, like you said, has to do with coverage. It is important to
> > mark as "unreachable" any part of the code that is indeed unreachable
> > so that we can account it properly when we do coverage analysis. At the
> > moment the only "unreachable" marker that we have is
> > ASSERT_UNREACHABLE(), and I am hoping we can use it as part of the
> > coverage analysis we'll do.
> > 
> > However, there is a different separate question about what to do in the
> > Xen code after an ASSERT_UNREACHABLE(). E.g.:
> > 
> >              default:
> >                  ASSERT_UNREACHABLE();
> >                  return -EPERM; /* is it better with or without this? */
> >              }
> > 
> > Leaving coverage aside, would it be better to be defensive and actually
> > attempt to report errors back after an ASSERT_UNREACHABLE() like in the
> > example? Or is it better to assume the code is actually unreachable
> > hence there is no need to do anything afterwards?
> > 
> > One one hand, being defensive sounds good, on the other hand, any code
> > we add after ASSERT_UNREACHABLE() is dead code which cannot be tested,
> > which is also not good. In this example, there is no way to test the
> > return -EPERM code path. We also need to consider what is the right
> > thing to do if Xen finds itself in an erroneous situation such as being
> > in an unreachable code location.
> > 
> > So, after thinking about it and also talking to the safety manager, I
> > think we should:
> > - implement ASSERT_UNREACHABLE with a warning in release builds
> 
> If at all, then controlled by a default-off Kconfig setting. This would,
> after all, raise the question of how ASSERT() should behave then. Imo
> the two should be consistent in this regard, and NDEBUG clearly results
> in the expectation that ASSERT() expands to nothing. Perhaps this is
> finally the time where we need to separate NDEBUG from CONFIG_DEBUG; we
> did discuss doing so before. Then in your release builds, you could
> actually leave assertions active.
 
Yes, a kconfig to define the behavior of ASSERT_UNREACHABLE in release
builds is fine. And you are right that we should consider doing
something similar for ASSERT too.

I think that in any environment where safety (i.e. correctness of
behavior) is a primary concern, attempting to continue without doing
anything after reaching a point that is supposed to be unreachable is
not a good idea.

I think Xen should do something in response to reaching a point it is
not supposed to reach. I don't know yet what is the best course of
action but printing a warning seems to be the bare minimum.

Crashing the system is not a good idea as it could potentially be
exploited by malicious guests (security might not be the primary concern
but still.)


> > - have "return -EPERM;" or similar for defensive programming
> 
> You don't say how you'd deal with the not-reachable aspect then.

We'll have to find a way to account for all the code that cannot be
tested. We would have a problem anyway due to the ASSERT_UNREACHABLE
checks, but the addition of "return -EPERM;" will make things slightly
worse.

I have been told to prioritize safety of the code and defensive
programming over coverage calculations.
Jan Beulich June 28, 2024, 6:15 a.m. UTC | #10
On 28.06.2024 00:59, Stefano Stabellini wrote:
> On Thu, 27 Jun 2024, Jan Beulich wrote:
>> On 27.06.2024 03:53, Stefano Stabellini wrote:
>>> On Wed, 26 Jun 2024, Jan Beulich wrote:
>>>> On 26.06.2024 03:11, Stefano Stabellini wrote:
>>>>> On Tue, 25 Jun 2024, Jan Beulich wrote:
>>>>>> On 25.06.2024 02:54, Stefano Stabellini wrote:
>>>>>>> On Mon, 24 Jun 2024, Federico Serafini wrote:
>>>>>>>> Add break or pseudo keyword fallthrough to address violations of
>>>>>>>> MISRA C Rule 16.3: "An unconditional `break' statement shall terminate
>>>>>>>> every switch-clause".
>>>>>>>>
>>>>>>>> No functional change.
>>>>>>>>
>>>>>>>> Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
>>>>>>>> ---
>>>>>>>>  xen/arch/x86/traps.c | 3 +++
>>>>>>>>  1 file changed, 3 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
>>>>>>>> index 9906e874d5..cbcec3fafb 100644
>>>>>>>> --- a/xen/arch/x86/traps.c
>>>>>>>> +++ b/xen/arch/x86/traps.c
>>>>>>>> @@ -1186,6 +1186,7 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
>>>>>>>>  
>>>>>>>>      default:
>>>>>>>>          ASSERT_UNREACHABLE();
>>>>>>>> +        break;
>>>>>>>
>>>>>>> Please add ASSERT_UNREACHABLE to the list of "unconditional flow control
>>>>>>> statements" that can terminate a case, in addition to break.
>>>>>>
>>>>>> Why? Exactly the opposite is part of the subject of a recent patch, iirc.
>>>>>> Simply because of the rules needing to cover both debug and release builds.
>>>>>
>>>>> The reason is that ASSERT_UNREACHABLE() might disappear from the release
>>>>> build but it can still be used as a marker during static analysis. In
>>>>> my view, ASSERT_UNREACHABLE() is equivalent to a noreturn function call
>>>>> which has an empty implementation in release builds.
>>>>>
>>>>> The only reason I can think of to require a break; after an
>>>>> ASSERT_UNREACHABLE() would be if we think the unreachability only apply
>>>>> to debug build, not release build:
>>>>>
>>>>> - debug build: it is unreachable
>>>>> - release build: it is reachable
>>>>>
>>>>> I don't think that is meant to be possible so I think we can use
>>>>> ASSERT_UNREACHABLE() as a marker.
>>>>
>>>> Well. For one such an assumption takes as a prereq that a debug build will
>>>> be run through full coverage testing, i.e. all reachable paths proven to
>>>> be taken. I understand that this prereq is intended to somehow be met,
>>>> even if I'm having difficulty seeing how such a final proof would look
>>>> like: Full coverage would, to me, mean that _every_ line is reachable. Yet
>>>> clearly any ASSERT_UNREACHABLE() must never be reached.
>>>>
>>>> And then not covering for such cases takes the further assumption that
>>>> debug and release builds are functionally identical. I'm afraid this would
>>>> be a wrong assumption to make:
>>>> 1) We may screw up somewhere, with code wrongly enabled only in one of the
>>>>    two build modes.
>>>> 2) The compiler may screw up, in particular with optimization.
>>>
>>> I think there are two different issues here we are discussing.
>>>
>>> One issue, like you said, has to do with coverage. It is important to
>>> mark as "unreachable" any part of the code that is indeed unreachable
>>> so that we can account it properly when we do coverage analysis. At the
>>> moment the only "unreachable" marker that we have is
>>> ASSERT_UNREACHABLE(), and I am hoping we can use it as part of the
>>> coverage analysis we'll do.
>>>
>>> However, there is a different separate question about what to do in the
>>> Xen code after an ASSERT_UNREACHABLE(). E.g.:
>>>
>>>              default:
>>>                  ASSERT_UNREACHABLE();
>>>                  return -EPERM; /* is it better with or without this? */
>>>              }
>>>
>>> Leaving coverage aside, would it be better to be defensive and actually
>>> attempt to report errors back after an ASSERT_UNREACHABLE() like in the
>>> example? Or is it better to assume the code is actually unreachable
>>> hence there is no need to do anything afterwards?
>>>
>>> One one hand, being defensive sounds good, on the other hand, any code
>>> we add after ASSERT_UNREACHABLE() is dead code which cannot be tested,
>>> which is also not good. In this example, there is no way to test the
>>> return -EPERM code path. We also need to consider what is the right
>>> thing to do if Xen finds itself in an erroneous situation such as being
>>> in an unreachable code location.
>>>
>>> So, after thinking about it and also talking to the safety manager, I
>>> think we should:
>>> - implement ASSERT_UNREACHABLE with a warning in release builds
>>
>> If at all, then controlled by a default-off Kconfig setting. This would,
>> after all, raise the question of how ASSERT() should behave then. Imo
>> the two should be consistent in this regard, and NDEBUG clearly results
>> in the expectation that ASSERT() expands to nothing. Perhaps this is
>> finally the time where we need to separate NDEBUG from CONFIG_DEBUG; we
>> did discuss doing so before. Then in your release builds, you could
>> actually leave assertions active.
>  
> Yes, a kconfig to define the behavior of ASSERT_UNREACHABLE in release
> builds is fine. And you are right that we should consider doing
> something similar for ASSERT too.
> 
> I think that in any environment where safety (i.e. correctness of
> behavior) is a primary concern, attempting to continue without doing
> anything after reaching a point that is supposed to be unreachable is
> not a good idea.
> 
> I think Xen should do something in response to reaching a point it is
> not supposed to reach. I don't know yet what is the best course of
> action but printing a warning seems to be the bare minimum.
> 
> Crashing the system is not a good idea as it could potentially be
> exploited by malicious guests (security might not be the primary concern
> but still.)

Yet continuing may set the system up for much harder to understand issues,
including crashes and exploitable issues later on.

Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 9906e874d5..cbcec3fafb 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1186,6 +1186,7 @@  void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
 
     default:
         ASSERT_UNREACHABLE();
+        break;
     }
 }
 
@@ -1748,6 +1749,7 @@  static void io_check_error(const struct cpu_user_regs *regs)
     {
     case 'd': /* 'dom0' */
         nmi_hwdom_report(_XEN_NMIREASON_io_error);
+        fallthrough;
     case 'i': /* 'ignore' */
         break;
     default:  /* 'fatal' */
@@ -1768,6 +1770,7 @@  static void unknown_nmi_error(const struct cpu_user_regs *regs,
     {
     case 'd': /* 'dom0' */
         nmi_hwdom_report(_XEN_NMIREASON_unknown);
+        fallthrough;
     case 'i': /* 'ignore' */
         break;
     default:  /* 'fatal' */