Message ID | a47962a573eab38991a8aa3e09ca6b8bd3193b6c.1695681330.git.sanastasio@raptorengineering.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Fix Power CI build | expand |
Adding Henry as this issue is currently causing gitlab-ci failures. I think we should fix it as soon as possible to get clearer results for the 4.18 release. (This comment applies to the rest of the series as well). On Mon, 25 Sep 2023, Shawn Anastasio wrote: > When building for Power with CONFIG_DEBUG unset, a compiler error gets > raised inside page_alloc.c's node_to_scrub function, likely due to the > increased optimization level: > > common/page_alloc.c: In function 'node_to_scrub.part.0': > common/page_alloc.c:1217:29: error: array subscript 1 is above array > bounds of 'long unsigned int[1]' [-Werror=array-bounds] > 1217 | if ( node_need_scrub[node] ) > > It appears that this is a false positive, given that in practice > cycle_node should never return a node ID >= MAX_NUMNODES as long as the > architecture's node_online_map is properly defined and initialized, so > this additional bounds check is only to satisfy GCC. > > Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> > --- > xen/common/page_alloc.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c > index 35d9a26fa6..6df2a223e1 100644 > --- a/xen/common/page_alloc.c > +++ b/xen/common/page_alloc.c > @@ -1211,6 +1211,9 @@ static unsigned int node_to_scrub(bool get_node) > } while ( !cpumask_empty(&node_to_cpumask(node)) && > (node != local_node) ); > > + if ( node >= MAX_NUMNODES ) > + break; > + > if ( node == local_node ) > break; > > -- > 2.30.2 >
Hi Stefano, > On Sep 26, 2023, at 07:04, Stefano Stabellini <sstabellini@kernel.org> wrote: > > Adding Henry as this issue is currently causing gitlab-ci failures. I > think we should fix it as soon as possible to get clearer results for > the 4.18 release. (This comment applies to the rest of the series as > well). Thanks, the CI should indeed be fixed asap. > > > On Mon, 25 Sep 2023, Shawn Anastasio wrote: >> When building for Power with CONFIG_DEBUG unset, a compiler error gets >> raised inside page_alloc.c's node_to_scrub function, likely due to the >> increased optimization level: >> >> common/page_alloc.c: In function 'node_to_scrub.part.0': >> common/page_alloc.c:1217:29: error: array subscript 1 is above array >> bounds of 'long unsigned int[1]' [-Werror=array-bounds] >> 1217 | if ( node_need_scrub[node] ) >> >> It appears that this is a false positive, given that in practice >> cycle_node should never return a node ID >= MAX_NUMNODES as long as the >> architecture's node_online_map is properly defined and initialized, so >> this additional bounds check is only to satisfy GCC. >> >> Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com> > > Acked-by: Stefano Stabellini <sstabellini@kernel.org> Release-acked-by: Henry Wang <Henry.Wang@arm.com> Kind regards, Henry > > >> --- >> xen/common/page_alloc.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c >> index 35d9a26fa6..6df2a223e1 100644 >> --- a/xen/common/page_alloc.c >> +++ b/xen/common/page_alloc.c >> @@ -1211,6 +1211,9 @@ static unsigned int node_to_scrub(bool get_node) >> } while ( !cpumask_empty(&node_to_cpumask(node)) && >> (node != local_node) ); >> >> + if ( node >= MAX_NUMNODES ) >> + break; >> + >> if ( node == local_node ) >> break; >> >> -- >> 2.30.2 >>
On 26.09.2023 00:42, Shawn Anastasio wrote: > When building for Power with CONFIG_DEBUG unset, a compiler error gets > raised inside page_alloc.c's node_to_scrub function, likely due to the > increased optimization level: > > common/page_alloc.c: In function 'node_to_scrub.part.0': > common/page_alloc.c:1217:29: error: array subscript 1 is above array > bounds of 'long unsigned int[1]' [-Werror=array-bounds] > 1217 | if ( node_need_scrub[node] ) That's gcc13? > It appears that this is a false positive, given that in practice > cycle_node should never return a node ID >= MAX_NUMNODES as long as the > architecture's node_online_map is properly defined and initialized, so > this additional bounds check is only to satisfy GCC. Looks very similar to the situation that c890499871cc ("timer: fix NR_CPUS=1 build with gcc13") was dealing with, just that here it's MAX_NUMNODES. I'd therefore prefer a solution similar to the one taken there, i.e. make code conditional rather than add yet more code. Otherwise, ... > --- a/xen/common/page_alloc.c > +++ b/xen/common/page_alloc.c > @@ -1211,6 +1211,9 @@ static unsigned int node_to_scrub(bool get_node) > } while ( !cpumask_empty(&node_to_cpumask(node)) && > (node != local_node) ); > > + if ( node >= MAX_NUMNODES ) > + break; ... this clearly redundant check would need to gain a comment. What I'm puzzled by is that on Arm we had no reports of a similar problem, despite NUMA also not getting selected there (yet). Jan
On 26.09.2023 09:32, Jan Beulich wrote: > On 26.09.2023 00:42, Shawn Anastasio wrote: >> When building for Power with CONFIG_DEBUG unset, Hmm, depending on what gcc versions are used in CI, the above may be the reason why ... >> a compiler error gets >> raised inside page_alloc.c's node_to_scrub function, likely due to the >> increased optimization level: >> >> common/page_alloc.c: In function 'node_to_scrub.part.0': >> common/page_alloc.c:1217:29: error: array subscript 1 is above array >> bounds of 'long unsigned int[1]' [-Werror=array-bounds] >> 1217 | if ( node_need_scrub[node] ) > > That's gcc13? > >> It appears that this is a false positive, given that in practice >> cycle_node should never return a node ID >= MAX_NUMNODES as long as the >> architecture's node_online_map is properly defined and initialized, so >> this additional bounds check is only to satisfy GCC. > > Looks very similar to the situation that c890499871cc ("timer: fix > NR_CPUS=1 build with gcc13") was dealing with, just that here it's > MAX_NUMNODES. I'd therefore prefer a solution similar to the one > taken there, i.e. make code conditional rather than add yet more > code. > > Otherwise, ... > >> --- a/xen/common/page_alloc.c >> +++ b/xen/common/page_alloc.c >> @@ -1211,6 +1211,9 @@ static unsigned int node_to_scrub(bool get_node) >> } while ( !cpumask_empty(&node_to_cpumask(node)) && >> (node != local_node) ); >> >> + if ( node >= MAX_NUMNODES ) >> + break; > > ... this clearly redundant check would need to gain a comment. > > What I'm puzzled by is that on Arm we had no reports of a similar > problem, despite NUMA also not getting selected there (yet). ... this wasn't observed, yet. As far as I'm concerned, all my Arm builds are debug ones (which I may need to change). Jan
On 9/26/23 2:32 AM, Jan Beulich wrote: > On 26.09.2023 00:42, Shawn Anastasio wrote: >> When building for Power with CONFIG_DEBUG unset, a compiler error gets >> raised inside page_alloc.c's node_to_scrub function, likely due to the >> increased optimization level: >> >> common/page_alloc.c: In function 'node_to_scrub.part.0': >> common/page_alloc.c:1217:29: error: array subscript 1 is above array >> bounds of 'long unsigned int[1]' [-Werror=array-bounds] >> 1217 | if ( node_need_scrub[node] ) > > That's gcc13? > Gcc 10.2.1, actually. >> It appears that this is a false positive, given that in practice >> cycle_node should never return a node ID >= MAX_NUMNODES as long as the >> architecture's node_online_map is properly defined and initialized, so >> this additional bounds check is only to satisfy GCC. > > Looks very similar to the situation that c890499871cc ("timer: fix > NR_CPUS=1 build with gcc13") was dealing with, just that here it's > MAX_NUMNODES. I'd therefore prefer a solution similar to the one > taken there, i.e. make code conditional rather than add yet more > code. > Looking at that commit, it doesn't look like conditionalizing this code in the same way would be as clean -- it'd likely require separate conditional blocks around the initial variable declarations and the function body. > Otherwise, ... > >> --- a/xen/common/page_alloc.c >> +++ b/xen/common/page_alloc.c >> @@ -1211,6 +1211,9 @@ static unsigned int node_to_scrub(bool get_node) >> } while ( !cpumask_empty(&node_to_cpumask(node)) && >> (node != local_node) ); >> >> + if ( node >= MAX_NUMNODES ) >> + break; > > ... this clearly redundant check would need to gain a comment. > This isn't a problem. I'll submit a v2 with a commit. > What I'm puzzled by is that on Arm we had no reports of a similar > problem, despite NUMA also not getting selected there (yet). > As you pointed out in your follow-up email, it might be due to CONFIG_DEBUG. It might also be because on ppc, our implementation of find_next_bit (which is used by cycle_node) is declared static inline, so the compiler has access to its definition when doing this sort of static analysis. On Arm I believe the function lives in its own compilation unit which might preclude GCC from making any assumptions about its return value. > Jan Thanks, Shawn
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index 35d9a26fa6..6df2a223e1 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -1211,6 +1211,9 @@ static unsigned int node_to_scrub(bool get_node) } while ( !cpumask_empty(&node_to_cpumask(node)) && (node != local_node) ); + if ( node >= MAX_NUMNODES ) + break; + if ( node == local_node ) break;
When building for Power with CONFIG_DEBUG unset, a compiler error gets raised inside page_alloc.c's node_to_scrub function, likely due to the increased optimization level: common/page_alloc.c: In function 'node_to_scrub.part.0': common/page_alloc.c:1217:29: error: array subscript 1 is above array bounds of 'long unsigned int[1]' [-Werror=array-bounds] 1217 | if ( node_need_scrub[node] ) It appears that this is a false positive, given that in practice cycle_node should never return a node ID >= MAX_NUMNODES as long as the architecture's node_online_map is properly defined and initialized, so this additional bounds check is only to satisfy GCC. Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com> --- xen/common/page_alloc.c | 3 +++ 1 file changed, 3 insertions(+) -- 2.30.2