[for-next,3/6] xen/sched: Fix build when NR_CPUS == 1

Message ID d0922adc698ab76223d76a0a7f328a72cedf00ad.1614265718.git.connojdavis@gmail.com (mailing list archive)
State New
Series Minimal build for RISCV

Commit Message

Connor Davis Feb. 25, 2021, 3:24 p.m. UTC
Return from cpu_schedule_up when either cpu is 0 or
NR_CPUS == 1. This fixes the following:

core.c: In function 'cpu_schedule_up':
core.c:2769:19: error: array subscript 1 is above array bounds
of 'struct vcpu *[1]' [-Werror=array-bounds]
 2769 |     if ( idle_vcpu[cpu] == NULL )
      |

Signed-off-by: Connor Davis <connojdavis@gmail.com>
---
 xen/common/sched/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jan Beulich Feb. 25, 2021, 3:50 p.m. UTC | #1
On 25.02.2021 16:24, Connor Davis wrote:
> Return from cpu_schedule_up when either cpu is 0 or
> NR_CPUS == 1. This fixes the following:
> 
> core.c: In function 'cpu_schedule_up':
> core.c:2769:19: error: array subscript 1 is above array bounds
> of 'struct vcpu *[1]' [-Werror=array-bounds]
>  2769 |     if ( idle_vcpu[cpu] == NULL )
>       |
> 
> Signed-off-by: Connor Davis <connojdavis@gmail.com>
> ---
>  xen/common/sched/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index 9745a77eee..f5ec65bf9b 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -2763,7 +2763,7 @@ static int cpu_schedule_up(unsigned int cpu)
>      cpumask_set_cpu(cpu, &sched_res_mask);
>  
>      /* Boot CPU is dealt with later in scheduler_init(). */
> -    if ( cpu == 0 )
> +    if ( cpu == 0 || NR_CPUS == 1 )
>          return 0;
>  
>      if ( idle_vcpu[cpu] == NULL )

I'm not convinced a compiler warning is due here, and in turn
I'm not sure we want/need to work around this the way you do.
First question is whether that's just a specific compiler
version that's flawed. If it's not just a special case (e.g.
some unreleased version) we may want to think of possible
alternatives - the addition looks really odd to me.

Jan
Bob Eshleman Feb. 25, 2021, 10:55 p.m. UTC | #2
On 2/25/21 7:24 AM, Connor Davis wrote:
> Return from cpu_schedule_up when either cpu is 0 or
> NR_CPUS == 1. This fixes the following:
> 
> core.c: In function 'cpu_schedule_up':
> core.c:2769:19: error: array subscript 1 is above array bounds
> of 'struct vcpu *[1]' [-Werror=array-bounds]
>  2769 |     if ( idle_vcpu[cpu] == NULL )
>       |
> 
> Signed-off-by: Connor Davis <connojdavis@gmail.com>
> ---
>  xen/common/sched/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index 9745a77eee..f5ec65bf9b 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -2763,7 +2763,7 @@ static int cpu_schedule_up(unsigned int cpu)
>      cpumask_set_cpu(cpu, &sched_res_mask);
>  
>      /* Boot CPU is dealt with later in scheduler_init(). */
> -    if ( cpu == 0 )
> +    if ( cpu == 0 || NR_CPUS == 1 )
>          return 0;
>  
>      if ( idle_vcpu[cpu] == NULL )
> 

Interesting.  I wonder when this changed in GCC.

I haven't yet seen this issue compiling with:
  NR_CPUS=1
  ARCH=riscv64
  riscv64-unknown-linux-gnu-gcc (GCC) 10.1.0

Which version of GCC are you seeing emit this?

- Bob
Connor Davis Feb. 26, 2021, 3:01 a.m. UTC | #3
On Thu, Feb 25, 2021 at 02:55:45PM -0800, Bob Eshleman wrote:
> On 2/25/21 7:24 AM, Connor Davis wrote:
> > Return from cpu_schedule_up when either cpu is 0 or
> > NR_CPUS == 1. This fixes the following:
> > 
> > core.c: In function 'cpu_schedule_up':
> > core.c:2769:19: error: array subscript 1 is above array bounds
> > of 'struct vcpu *[1]' [-Werror=array-bounds]
> >  2769 |     if ( idle_vcpu[cpu] == NULL )
> >       |
> > 
> > Signed-off-by: Connor Davis <connojdavis@gmail.com>
> > ---
> >  xen/common/sched/core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> > index 9745a77eee..f5ec65bf9b 100644
> > --- a/xen/common/sched/core.c
> > +++ b/xen/common/sched/core.c
> > @@ -2763,7 +2763,7 @@ static int cpu_schedule_up(unsigned int cpu)
> >      cpumask_set_cpu(cpu, &sched_res_mask);
> >  
> >      /* Boot CPU is dealt with later in scheduler_init(). */
> > -    if ( cpu == 0 )
> > +    if ( cpu == 0 || NR_CPUS == 1 )
> >          return 0;
> >  
> >      if ( idle_vcpu[cpu] == NULL )
> > 
> 
> Interesting.  I wonder when this changed in GCC.
> 
> I haven't yet seen this issue compiling with:
>   NR_CPUS=1
>   ARCH=riscv64
>   riscv64-unknown-linux-gnu-gcc (GCC) 10.1.0
> 
> Which version of GCC are you seeing emit this?

The one cloned from github.com/riscv/riscv-gnu-toolchain
in the docker container is 10.2.0

    Connor
Connor Davis Feb. 26, 2021, 3:08 a.m. UTC | #4
On Thu, Feb 25, 2021 at 04:50:02PM +0100, Jan Beulich wrote:
> On 25.02.2021 16:24, Connor Davis wrote:
> > Return from cpu_schedule_up when either cpu is 0 or
> > NR_CPUS == 1. This fixes the following:
> > 
> > core.c: In function 'cpu_schedule_up':
> > core.c:2769:19: error: array subscript 1 is above array bounds
> > of 'struct vcpu *[1]' [-Werror=array-bounds]
> >  2769 |     if ( idle_vcpu[cpu] == NULL )
> >       |
> > 
> > Signed-off-by: Connor Davis <connojdavis@gmail.com>
> > ---
> >  xen/common/sched/core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> > index 9745a77eee..f5ec65bf9b 100644
> > --- a/xen/common/sched/core.c
> > +++ b/xen/common/sched/core.c
> > @@ -2763,7 +2763,7 @@ static int cpu_schedule_up(unsigned int cpu)
> >      cpumask_set_cpu(cpu, &sched_res_mask);
> >  
> >      /* Boot CPU is dealt with later in scheduler_init(). */
> > -    if ( cpu == 0 )
> > +    if ( cpu == 0 || NR_CPUS == 1 )
> >          return 0;
> >  
> >      if ( idle_vcpu[cpu] == NULL )
> 
> I'm not convinced a compiler warning is due here, and in turn
> I'm not sure we want/need to work around this the way you do.

It seems like a reasonable warning to me, but of course I'm open
to dealing with it in a different way.

> First question is whether that's just a specific compiler
> version that's flawed. If it's not just a special case (e.g.

The docker container uses gcc 10.2.0 from
https://github.com/riscv/riscv-gnu-toolchain

> some unreleased version) we may want to think of possible
> alternatives - the addition looks really odd to me.
> 
> Jan

    Connor
Jan Beulich Feb. 26, 2021, 8:31 a.m. UTC | #5
On 26.02.2021 04:08, Connor Davis wrote:
> On Thu, Feb 25, 2021 at 04:50:02PM +0100, Jan Beulich wrote:
>> On 25.02.2021 16:24, Connor Davis wrote:
>>> Return from cpu_schedule_up when either cpu is 0 or
>>> NR_CPUS == 1. This fixes the following:
>>>
>>> core.c: In function 'cpu_schedule_up':
>>> core.c:2769:19: error: array subscript 1 is above array bounds
>>> of 'struct vcpu *[1]' [-Werror=array-bounds]
>>>  2769 |     if ( idle_vcpu[cpu] == NULL )
>>>       |
>>>
>>> Signed-off-by: Connor Davis <connojdavis@gmail.com>
>>> ---
>>>  xen/common/sched/core.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
>>> index 9745a77eee..f5ec65bf9b 100644
>>> --- a/xen/common/sched/core.c
>>> +++ b/xen/common/sched/core.c
>>> @@ -2763,7 +2763,7 @@ static int cpu_schedule_up(unsigned int cpu)
>>>      cpumask_set_cpu(cpu, &sched_res_mask);
>>>  
>>>      /* Boot CPU is dealt with later in scheduler_init(). */
>>> -    if ( cpu == 0 )
>>> +    if ( cpu == 0 || NR_CPUS == 1 )
>>>          return 0;
>>>  
>>>      if ( idle_vcpu[cpu] == NULL )
>>
>> I'm not convinced a compiler warning is due here, and in turn
>> I'm not sure we want/need to work around this the way you do.
> 
> It seems like a reasonable warning to me, but of course I'm open
> to dealing with it in a different way.
> 
>> First question is whether that's just a specific compiler
>> version that's flawed. If it's not just a special case (e.g.
> 
> The docker container uses gcc 10.2.0 from
> https://github.com/riscv/riscv-gnu-toolchain

Ah yes, at -O2 I can observe the warning on e.g.

extern int array[N];

int test(unsigned i) {
	if(i == N - 1)
		return 0;
	return array[i];
}

when N=1. No warning appears when N=2 or higher, yet if it is
sensible to emit for N=1 then it would imo be similarly
sensible to emit in other cases. The only difference is that
when N=1, there's no i for which the array access would ever
be valid, while e.g. for N=2 there's exactly one such i.
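
For anyone wanting to try this outside the Xen tree, the reproducer above can be made self-contained roughly as follows. This is only a sketch: the default value of N, the array definition with an initializer, the BUG_ON() stand-in and the name test_guarded() are assumptions added for illustration and are not taken from the Xen sources. Building with -O2 -DN=1 on gcc 10 is the configuration for which the warning was reported in this thread; other compilers and versions may behave differently.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdlib.h>   /* abort() */

/* N stands in for NR_CPUS; build with -DN=1 to try the single-element case. */
#ifndef N
#define N 4
#endif

static_assert(N >= 1, "the array needs at least one element");

/* Hypothetical stand-in for Xen's BUG_ON(): does not return when cond holds. */
#define BUG_ON(cond) do { if (cond) abort(); } while (0)

/* Defined here (rather than declared extern) so the file links on its own. */
int array[N] = { 42 };

/* Unguarded form from the mail above. */
int test(unsigned i)
{
    if (i == N - 1)
        return 0;
    return array[i];
}

/* Guarded form: the explicit check tells the compiler that i < N below. */
int test_guarded(unsigned i)
{
    if (i == N - 1)
        return 0;
    BUG_ON(i >= N);
    return array[i];
}
```

The guard works because the failing branch never returns, so on the path that reaches the array access the compiler can assume the index is in range.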

I've tried an x86 build with NR_CPUS=1, and this hits the case
you found and a 2nd one, where behavior is even more puzzling.
For the case you've found I'd like to suggest as alternative

@@ -2769,6 +2769,12 @@ static int cpu_schedule_up(unsigned int
     if ( cpu == 0 )
         return 0;
 
+    /*
+     * Guard in particular also against the compiler suspecting out-of-bounds
+     * array accesses below when NR_CPUS=1.
+     */
+    BUG_ON(cpu >= NR_CPUS);
+
     if ( idle_vcpu[cpu] == NULL )
         vcpu_create(idle_vcpu[0]->domain, cpu);
     else

To fix the x86 build in this regard we'd additionally need
something along the lines of

--- unstable.orig/xen/arch/x86/genapic/x2apic.c
+++ unstable/xen/arch/x86/genapic/x2apic.c
@@ -54,7 +54,17 @@ static void init_apic_ldr_x2apic_cluster
     per_cpu(cluster_cpus, this_cpu) = cluster_cpus_spare;
     for_each_online_cpu ( cpu )
     {
-        if (this_cpu == cpu || x2apic_cluster(this_cpu) != x2apic_cluster(cpu))
+        if ( this_cpu == cpu )
+            continue;
+        /*
+         * Guard in particular against the compiler suspecting out-of-bounds
+         * array accesses below when NR_CPUS=1 (oddly enough with gcc 10 it
+         * is the 1st of these alone which actually helps, not the 2nd, nor
+         * are both required together there).
+         */
+        BUG_ON(this_cpu >= NR_CPUS);
+        BUG_ON(cpu >= NR_CPUS);
+        if ( x2apic_cluster(this_cpu) != x2apic_cluster(cpu) )
             continue;
         per_cpu(cluster_cpus, this_cpu) = per_cpu(cluster_cpus, cpu);
         break;

but the comment points out how strangely the compiler behaves here.
Even flipping around the two sides of the != doesn't change its
behavior. It is perhaps relevant to note here that there's no
special casing of smp_processor_id() in the NR_CPUS=1 case, so the
compiler can't infer this_cpu == 0.
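
The shape of the x2apic change, with two independent indices into per-CPU data, can be illustrated in isolation along these lines. This is a sketch under stated assumptions: cluster_id[], shares_cluster() and the BUG_ON() stand-in are hypothetical names for illustration, and the plain array merely stands in for whatever x2apic_cluster() reads. It does not reproduce the gcc 10 oddity described in the comment; it only shows the split condition followed by one bound check per index.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdbool.h>
#include <stdlib.h>    /* abort() */

#ifndef NR_CPUS
#define NR_CPUS 4      /* assumption: small value chosen for illustration */
#endif

static_assert(NR_CPUS >= 1, "at least one CPU is required");

/* Hypothetical stand-in for Xen's BUG_ON(): does not return when cond holds. */
#define BUG_ON(cond) do { if (cond) abort(); } while (0)

/* Hypothetical per-CPU table standing in for the data behind x2apic_cluster(). */
unsigned int cluster_id[NR_CPUS];

/* True when 'cpu' is a different CPU that sits in the same cluster as 'this_cpu'. */
bool shares_cluster(unsigned int this_cpu, unsigned int cpu)
{
    if (this_cpu == cpu)
        return false;

    /* One bound check per index, placed before the first array access. */
    BUG_ON(this_cpu >= NR_CPUS);
    BUG_ON(cpu >= NR_CPUS);

    return cluster_id[this_cpu] == cluster_id[cpu];
}
```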

Once we've settled on how to change common/sched/core.c I guess
I'll then adjust the x86-specific change accordingly and submit as
a separate fix (or I could of course also bundle both changes then).

Jan
Bob Eshleman Feb. 26, 2021, 3:21 p.m. UTC | #6
On 2/25/21 7:01 PM, Connor Davis wrote:
> On Thu, Feb 25, 2021 at 02:55:45PM -0800, Bob Eshleman wrote:
>>   riscv64-unknown-linux-gnu-gcc (GCC) 10.1.0
>>
>> Which version of GCC are you seeing emit this?
> 
> The one cloned from github.com/riscv/riscv-gnu-toolchain
> in the docker container is 10.2.0
> 
>     Connor
> 

The commit I pinned in the container is actually for GDB only, since
more recent versions broke when used with QEMU at the time I wrote
the dockerfile (last June).

Since I built the container some months ago and there is no commit
pinning for the compiler, it still contains 10.1.0 for me.

It _shouldn't_ be necessary...  but since there is a lot of dev done
on riscv-gcc, it might be worth talking about pinning the compiler
version in the container.

-Bob
Dario Faggioli Feb. 26, 2021, 4:49 p.m. UTC | #7
On Fri, 2021-02-26 at 09:31 +0100, Jan Beulich wrote:
> On 26.02.2021 04:08, Connor Davis wrote:
> > On Thu, Feb 25, 2021 at 04:50:02PM +0100, Jan Beulich wrote:
> > > On 25.02.2021 16:24, Connor Davis wrote:
> > > > index 9745a77eee..f5ec65bf9b 100644
> > > > --- a/xen/common/sched/core.c
> > > > +++ b/xen/common/sched/core.c
> > > > @@ -2763,7 +2763,7 @@ static int cpu_schedule_up(unsigned int
> > > > cpu)
> > > >      cpumask_set_cpu(cpu, &sched_res_mask);
> > > >  
> > > >      /* Boot CPU is dealt with later in scheduler_init(). */
> > > > -    if ( cpu == 0 )
> > > > +    if ( cpu == 0 || NR_CPUS == 1 )
> > > >          return 0;
> > > >  
> > > >      if ( idle_vcpu[cpu] == NULL )
> 
> @@ -2769,6 +2769,12 @@ static int cpu_schedule_up(unsigned int
>      if ( cpu == 0 )
>          return 0;
>  
> +    /*
> +     * Guard in particular also against the compiler suspecting out-of-bounds
> +     * array accesses below when NR_CPUS=1.
> +     */
> +    BUG_ON(cpu >= NR_CPUS);
> +
>
I would be fine with this.

Actually, I do prefer it too, over the "is index 0 or is the array
length 1" check, which I also find confusing.

Regards
Connor Davis Feb. 27, 2021, 4:17 a.m. UTC | #8
On Fri, Feb 26, 2021 at 09:31:02AM +0100, Jan Beulich wrote:
> On 26.02.2021 04:08, Connor Davis wrote:
> > On Thu, Feb 25, 2021 at 04:50:02PM +0100, Jan Beulich wrote:
> >> On 25.02.2021 16:24, Connor Davis wrote:
> >>> Return from cpu_schedule_up when either cpu is 0 or
> >>> NR_CPUS == 1. This fixes the following:
> >>>
> >>> core.c: In function 'cpu_schedule_up':
> >>> core.c:2769:19: error: array subscript 1 is above array bounds
> >>> of 'struct vcpu *[1]' [-Werror=array-bounds]
> >>>  2769 |     if ( idle_vcpu[cpu] == NULL )
> >>>       |
> >>>
> 
> Ah yes, at -O2 I can observe the warning on e.g.
> 
> extern int array[N];
> 
> int test(unsigned i) {
> 	if(i == N - 1)
> 		return 0;
> 	return array[i];
> }
> 
> when N=1. No warning appears when N=2 or higher, yet if it is
> sensible to emit for N=1 then it would imo be similarly
> sensible to emit in other cases. The only difference is that
> when N=1, there's no i for which the array access would ever
> be valid, while e.g. for N=2 there's exactly one such i.
> 
> I've tried an x86 build with NR_CPUS=1, and this hits the case
> you found and a 2nd one, where behavior is even more puzzling.
> For the case you've found I'd like to suggest as alternative
> 
> @@ -2769,6 +2769,12 @@ static int cpu_schedule_up(unsigned int
>      if ( cpu == 0 )
>          return 0;
>  
> +    /*
> +     * Guard in particular also against the compiler suspecting out-of-bounds
> +     * array accesses below when NR_CPUS=1.
> +     */
> +    BUG_ON(cpu >= NR_CPUS);
> +

Yeah I like this better than my approach.

>      if ( idle_vcpu[cpu] == NULL )
>          vcpu_create(idle_vcpu[0]->domain, cpu);
>      else
> 
> To fix the x86 build in this regard we'd additionally need
> something along the lines of
> 
> --- unstable.orig/xen/arch/x86/genapic/x2apic.c
> +++ unstable/xen/arch/x86/genapic/x2apic.c
> @@ -54,7 +54,17 @@ static void init_apic_ldr_x2apic_cluster
>      per_cpu(cluster_cpus, this_cpu) = cluster_cpus_spare;
>      for_each_online_cpu ( cpu )
>      {
> -        if (this_cpu == cpu || x2apic_cluster(this_cpu) != x2apic_cluster(cpu))
> +        if ( this_cpu == cpu )
> +            continue;
> +        /*
> +         * Guard in particular against the compiler suspecting out-of-bounds
> +         * array accesses below when NR_CPUS=1 (oddly enough with gcc 10 it
> +         * is the 1st of these alone which actually helps, not the 2nd, nor
> +         * are both required together there).
> +         */
> +        BUG_ON(this_cpu >= NR_CPUS);
> +        BUG_ON(cpu >= NR_CPUS);
> +        if ( x2apic_cluster(this_cpu) != x2apic_cluster(cpu) )
>              continue;
>          per_cpu(cluster_cpus, this_cpu) = per_cpu(cluster_cpus, cpu);
>          break;
> 
> but the comment points out how strangely the compiler behaves here.
> Even flipping around the two sides of the != doesn't change its
> behavior. It is perhaps relevant to note here that there's no
> special casing of smp_processor_id() in the NR_CPUS=1 case, so the
> compiler can't infer this_cpu == 0.
> 
> Once we've settled on how to change common/sched/core.c I guess
> I'll then adjust the x86-specific change accordingly and submit as
> a separate fix (or I could of course also bundle both changes then).

Feel free to bundle both.

    Connor

Patch

diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 9745a77eee..f5ec65bf9b 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -2763,7 +2763,7 @@ static int cpu_schedule_up(unsigned int cpu)
     cpumask_set_cpu(cpu, &sched_res_mask);
 
     /* Boot CPU is dealt with later in scheduler_init(). */
-    if ( cpu == 0 )
+    if ( cpu == 0 || NR_CPUS == 1 )
         return 0;
 
     if ( idle_vcpu[cpu] == NULL )