
[1/7] cpumask: fix checking valid cpu range

Message ID 20220919210559.1509179-2-yury.norov@gmail.com (mailing list archive)
State Not Applicable
Series cpumask: repair cpumask_check()

Checks

Context: netdev/tree_selection
Check: success
Description: Guessing tree name failed - patch did not apply, async

Commit Message

Yury Norov Sept. 19, 2022, 9:05 p.m. UTC
The range of valid CPUs is [0, nr_cpu_ids). Some cpumask functions are
passed a shifted CPU index, and for them the valid range is
[-1, nr_cpu_ids-1). Currently, for those functions, we check the index
against [-1, nr_cpu_ids), which is wrong.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/cpumask.h | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

Comments

Valentin Schneider Sept. 28, 2022, 12:18 p.m. UTC | #1
On 19/09/22 14:05, Yury Norov wrote:
> The range of valid CPUs is [0, nr_cpu_ids). Some cpumask functions are
> passed a shifted CPU index, and for them the valid range is
> [-1, nr_cpu_ids-1). Currently, for those functions, we check the index
> against [-1, nr_cpu_ids), which is wrong.
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  include/linux/cpumask.h | 19 ++++++++-----------
>  1 file changed, 8 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index e4f9136a4a63..a1cd4eb1a3d6 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -174,9 +174,8 @@ static inline unsigned int cpumask_last(const struct cpumask *srcp)
>  static inline
>  unsigned int cpumask_next(int n, const struct cpumask *srcp)
>  {
> -	/* -1 is a legal arg here. */
> -	if (n != -1)
> -		cpumask_check(n);
> +	/* n is a prior cpu */
> +	cpumask_check(n + 1);
>       return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);

I'm confused; this makes passing nr_cpu_ids-1 to cpumask_next*() trigger a
warning. The documentation does state:

* @n: the cpu prior to the place to search (ie. return will be > @n)

So n is a valid CPU number (with -1 being the exception for scan
initialization); this shouldn't exclude nr_cpu_ids-1.

IMO passing nr_cpu_ids-1 should be treated the same as passing the
last set bit in a bitmap: no warning, and returns the bitmap
size. Otherwise reaching nr_cpu_ids-1 has to be special-cased by the
calling code, which seems like unnecessary boilerplate.

For instance, I trigger the cpumask_check() warning there:

3d2dcab932d0:block/blk-mq.c @l2047
        if (--hctx->next_cpu_batch <= 0) {
select_cpu:
                next_cpu = cpumask_next_and(next_cpu, hctx->cpumask, <-----
                                cpu_online_mask);
                if (next_cpu >= nr_cpu_ids)
                        next_cpu = blk_mq_first_mapped_cpu(hctx);
                hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
        }

next_cpu is a valid CPU number, shifting it doesn't seem to make sense, and
we do want it to reach nr_cpu_ids-1.
Yury Norov Sept. 28, 2022, 2:49 p.m. UTC | #2
On Wed, Sep 28, 2022 at 01:18:20PM +0100, Valentin Schneider wrote:
> On 19/09/22 14:05, Yury Norov wrote:
> > The range of valid CPUs is [0, nr_cpu_ids). Some cpumask functions are
> > passed a shifted CPU index, and for them the valid range is
> > [-1, nr_cpu_ids-1). Currently, for those functions, we check the index
> > against [-1, nr_cpu_ids), which is wrong.
> >
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >  include/linux/cpumask.h | 19 ++++++++-----------
> >  1 file changed, 8 insertions(+), 11 deletions(-)
> >
> > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > index e4f9136a4a63..a1cd4eb1a3d6 100644
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h
> > @@ -174,9 +174,8 @@ static inline unsigned int cpumask_last(const struct cpumask *srcp)
> >  static inline
> >  unsigned int cpumask_next(int n, const struct cpumask *srcp)
> >  {
> > -	/* -1 is a legal arg here. */
> > -	if (n != -1)
> > -		cpumask_check(n);
> > +	/* n is a prior cpu */
> > +	cpumask_check(n + 1);
> >       return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
> 
> I'm confused; this makes passing nr_cpu_ids-1 to cpumask_next*() trigger a
> warning. The documentation does state:
> 
> * @n: the cpu prior to the place to search (ie. return will be > @n)
> 
> So n is a valid CPU number (with -1 being the exception for scan
> initialization); this shouldn't exclude nr_cpu_ids-1.

For a regular cpumask function, like cpumask_any_but(), the valid range is
[0, nr_cpu_ids).

'Special' functions shift by 1 when calling the underlying find API:

  static inline
  unsigned int cpumask_next(int n, const struct cpumask *srcp)
  {
          /* n is a prior cpu */
          cpumask_check(n + 1);
          return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
  }

So, for them the valid range [0, nr_cpu_ids) must be shifted in the other
direction: [-1, nr_cpu_ids-1).
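
With CONFIG_DEBUG_PER_CPU_MAPS enabled, the check essentially boils down
to this (simplified sketch, not the literal source):

  static inline unsigned int cpumask_check(unsigned int cpu)
  {
          /* complain once if the index is outside [0, nr_cpu_ids) */
          WARN_ON_ONCE(cpu >= nr_cpu_ids);
          return cpu;
  }

Checking 'n + 1' instead of 'n' applies that same [0, nr_cpu_ids) test
to the index that is actually handed to find_next_bit().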

> IMO passing nr_cpu_ids-1 should be treated the same as passing the
> last set bit in a bitmap: no warning, and returns the bitmap
> size.

This is how cpumask_check() works for normal functions. For
cpumask_next(), passing nr_cpu_ids-1 is the same as passing nr_cpu_ids
for cpumask_any_but(), and it should trigger a warning in both cases.
(Or should not, but it's a different story.)
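
To make that equivalence concrete (illustrative only):

  cpumask_any_but(mask, nr_cpu_ids);     /* index outside [0, nr_cpu_ids): warns */
  cpumask_next(nr_cpu_ids - 1, mask);    /* same out-of-range index after the +1 shift */

Both end up validating an index equal to nr_cpu_ids.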

> Otherwise reaching nr_cpu_ids-1 has to be special-cased by the
> calling code, which seems like unnecessary boilerplate.
> 
> For instance, I trigger the cpumask_check() warning there:
> 
> 3d2dcab932d0:block/blk-mq.c @l2047
>         if (--hctx->next_cpu_batch <= 0) {
> select_cpu:
>                 next_cpu = cpumask_next_and(next_cpu, hctx->cpumask, <-----
>                                 cpu_online_mask);
>                 if (next_cpu >= nr_cpu_ids)
>                         next_cpu = blk_mq_first_mapped_cpu(hctx);
>                 hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
>         }
> 
> next_cpu is a valid CPU number, shifting it doesn't seem to make sense, and
> we do want it to reach nr_cpu_ids-1.

next_cpu is a valid CPU number for everything else, but not for cpumask_next().
The warning is valid. If we are at the very last cpu, why would we look
for the next one?

The snippet above should be fixed like this:

          if (--hctx->next_cpu_batch <= 0) {
  select_cpu:
                  if (next_cpu == nr_cpu_ids - 1)
                          next_cpu = nr_cpu_ids;
                  else
                          next_cpu = cpumask_next_and(next_cpu,
                                                      hctx->cpumask,
                                                      cpu_online_mask);
                  if (next_cpu >= nr_cpu_ids)
                          next_cpu = blk_mq_first_mapped_cpu(hctx);
                  hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
          }

The original motivation for these special shifted semantics was to
avoid passing '+1' to cpumask_next() everywhere it's used to iterate
over a cpumask. This is especially ugly because it brings negative
values into something as simple as an index, and it confuses people.
It was a bad decision, but now it's so broadly used that we have to live
with it.

The strategy to mitigate this is to minimize the use of those 'special'
functions. They are all cpumask_next()-like. In this series I reworked
for_each_cpu() to not use cpumask_next().
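
The direction of that rework is roughly the following (sketch, not the
exact patch):

  /* iterate with the find-bit machinery directly, no shifted index involved */
  #define for_each_cpu(cpu, mask) \
          for_each_set_bit(cpu, cpumask_bits(mask), nr_cpumask_bits)

so the iterator never feeds a 'prior cpu' back into cpumask_next().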

Often, cpumask_next() is part of an open-coded for_each_cpu(), and this
is relatively easy to fix. In the case of blk_mq_hctx_next_cpu() that you
mentioned above, the cpumask_next_and() usage looks unavoidable, and
there's nothing to be done about it, except being careful.
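
For illustration only (do_something() is just a placeholder, not from
the series), an open-coded scan like

  for (cpu = cpumask_next(-1, mask); cpu < nr_cpu_ids;
       cpu = cpumask_next(cpu, mask))
          do_something(cpu);

can end up calling cpumask_next() with cpu == nr_cpu_ids - 1 and hit the
check, while the equivalent

  for_each_cpu(cpu, mask)
          do_something(cpu);

goes through the reworked iterator and never passes a shifted index to
cpumask_next().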

It didn't trigger the warning in my test setup, so I didn't fix it.
Feel free to submit a patch, if you observe the warning for yourself.

Maybe we should consider nr_cpu_ids as a special valid index for
cpumask_check(), a sign of the end of an array. This would help to
silence many warnings, like this one. For now I'm leaning towards it
being more of a hack than a meaningful change.
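
If we ever went that way, it would be something like this (purely
hypothetical sketch):

  static inline unsigned int cpumask_check(unsigned int cpu)
  {
          /* tolerate nr_cpu_ids as an 'end of array' marker, warn past it */
          WARN_ON_ONCE(cpu > nr_cpu_ids);
          return cpu;
  }

which is why it looks more like papering over the callers than fixing
them.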

Thanks,
Yury
Valentin Schneider Sept. 30, 2022, 5:04 p.m. UTC | #3
On 28/09/22 07:49, Yury Norov wrote:
> On Wed, Sep 28, 2022 at 01:18:20PM +0100, Valentin Schneider wrote:
>> On 19/09/22 14:05, Yury Norov wrote:
>> > @@ -174,9 +174,8 @@ static inline unsigned int cpumask_last(const struct cpumask *srcp)
>> >  static inline
>> >  unsigned int cpumask_next(int n, const struct cpumask *srcp)
>> >  {
>> > -	/* -1 is a legal arg here. */
>> > -	if (n != -1)
>> > -		cpumask_check(n);
>> > +	/* n is a prior cpu */
>> > +	cpumask_check(n + 1);
>> >       return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
>>
>> I'm confused; this makes passing nr_cpu_ids-1 to cpumask_next*() trigger a
>> warning. The documentation does state:
>>
>> * @n: the cpu prior to the place to search (ie. return will be > @n)
>>
>> So n is a valid CPU number (with -1 being the exception for scan
>> initialization); this shouldn't exclude nr_cpu_ids-1.
>
> For a regular cpumask function, like cpumask_any_but(), the valid range is
> [0, nr_cpu_ids).
>
> 'Special' functions shift by 1 when calling the underlying find API:
>
>   static inline
>   unsigned int cpumask_next(int n, const struct cpumask *srcp)
>   {
>           /* n is a prior cpu */
>           cpumask_check(n + 1);
>           return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
>   }
>
> So, for them the valid range [0, nr_cpu_ids) must be shifted in the other
> direction: [-1, nr_cpu_ids-1).
>

The way I've been seeing this is that the [0, nr_cpu_ids) range is extended
to [-1, nr_cpu_ids) to accommodate iteration starts.
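
That is, the -1 only exists so that

  cpu = cpumask_next(-1, mask);

can serve as the "give me the first CPU" step of an iteration,
equivalent to cpumask_first(mask).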

>> IMO passing nr_cpu_ids-1 should be treated the same as passing the
>> last set bit in a bitmap: no warning, and returns the bitmap
>> size.
>
> This is how cpumask_check() works for normal functions. For
> cpumask_next(), passing nr_cpu_ids-1 is the same as passing nr_cpu_ids
> for cpumask_any_but(), and it should trigger a warning in both cases.
> (Or should not, but it's a different story.)
>
>> Otherwise reaching nr_cpu_ids-1 has to be special-cased by the
>> calling code, which seems like unnecessary boilerplate.
>>
>> For instance, I trigger the cpumask_check() warning there:
>>
>> 3d2dcab932d0:block/blk-mq.c @l2047
>>         if (--hctx->next_cpu_batch <= 0) {
>> select_cpu:
>>                 next_cpu = cpumask_next_and(next_cpu, hctx->cpumask, <-----
>>                                 cpu_online_mask);
>>                 if (next_cpu >= nr_cpu_ids)
>>                         next_cpu = blk_mq_first_mapped_cpu(hctx);
>>                 hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
>>         }
>>
>> next_cpu is a valid CPU number, shifting it doesn't seem to make sense, and
>> we do want it to reach nr_cpu_ids-1.
>
> next_cpu is a valid CPU number for everything else, but not for cpumask_next().
> The warning is valid. If we are at the very last cpu, why would we look
> for the next one?
>

Consider:

  nr_cpu_ids=4

  A)
  cpumask: 0.1.1.0
  CPU      0 1 2 3
  n            ^
  result: nr_cpu_ids

  B)
  cpumask: 0.0.1.1
  CPU      0 1 2 3
  n              ^
  result: nr_cpu_ids + WARN

Both scenarios are identical from a user perspective: a valid CPU number
was passed in (either from smp_processor_id() or from a previous call to
cpumask_next*()), but there are no more bits set in the cpumask. There are no
more CPUs to search for in either scenario, but only one produces a WARN.
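
Or, in code form, taking scenario B (nr_cpu_ids == 4, CPUs 2 and 3 set):

  n = cpumask_next(1, mask);   /* returns 2, no warning           */
  n = cpumask_next(2, mask);   /* returns 3, no warning           */
  n = cpumask_next(3, mask);   /* returns nr_cpu_ids, plus a WARN */

The last call is the natural end of the scan, exactly like
cpumask_next(2, mask) is in scenario A, yet only this one warns.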

> The snippet above should be fixed like this:
>
>           if (--hctx->next_cpu_batch <= 0) {
>   select_cpu:
>                   if (next_cpu == nr_cpu_ids - 1)
>                           next_cpu = nr_cpu_ids;
>                   else
>                           next_cpu = cpumask_next_and(next_cpu,
>                                                       hctx->cpumask,
>                                                       cpu_online_mask);
>                   if (next_cpu >= nr_cpu_ids)
>                           next_cpu = blk_mq_first_mapped_cpu(hctx);
>                   hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
>           }
>
> The original motivation for these special shifted semantics was to
> avoid passing '+1' to cpumask_next() everywhere it's used to iterate
> over a cpumask. This is especially ugly because it brings negative
> values into something as simple as an index, and it confuses people.
> It was a bad decision, but now it's so broadly used that we have to live
> with it.
>
> The strategy to mitigate this is to minimize the use of those 'special'
> functions. They are all cpumask_next()-like. In this series I reworked
> for_each_cpu() to not use cpumask_next().
>
> Often, cpumask_next() is part of an open-coded for_each_cpu(), and this
> is relatively easy to fix. In the case of blk_mq_hctx_next_cpu() that you
> mentioned above, the cpumask_next_and() usage looks unavoidable, and
> there's nothing to be done about it, except being careful.
>
> It didn't trigger the warning in my test setup, so I didn't fix it.
> Feel free to submit a patch, if you observe the warning for yourself.
>
> Maybe we should consider nr_cpu_ids as a special valid index for
> cpumask_check(), a sign of the end of an array. This would help to
> silence many warnings, like this one. For now I'm leaning towards it
> being more of a hack than a meaningful change.
>

I agree, we definitely want to warn for e.g.

  cpumask_set_cpu(nr_cpu_ids, ...);

Could we instead make cpumask_next*() immediately return nr_cpu_ids when
passed n=nr_cpu_ids-1?
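
Something along these lines (sketch of the idea only, untested):

  static inline
  unsigned int cpumask_next(int n, const struct cpumask *srcp)
  {
          /* a scan that already reached the last possible CPU is simply over */
          if (n == (int)nr_cpu_ids - 1)
                  return nr_cpu_ids;
          /* n is a prior cpu */
          cpumask_check(n + 1);
          return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
  }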

Also, what about cpumask_next_wrap()? That uses cpumask_next() under the
hood and is bound to warn when wrapping after n=nr_cpu_ids-1, I think.


> Thanks,
> Yury
Yury Norov Oct. 1, 2022, 2:02 a.m. UTC | #4
On Fri, Sep 30, 2022 at 06:04:08PM +0100, Valentin Schneider wrote:
[...]

> > next_cpu is a valid CPU number for everything else, but not for cpumask_next().
> > The warning is valid. If we are at the very last cpu, why would we look
> > for the next one?
> >
> 
> Consider:
> 
>   nr_cpu_ids=4
> 
>   A)
>   cpumask: 0.1.1.0
>   CPU      0 1 2 3
>   n            ^
>   result: nr_cpu_ids
> 
>   B)
>   cpumask: 0.0.1.1
>   CPU      0 1 2 3
>   n              ^
>   result: nr_cpu_ids + WARN
> 
> Both scenarios are identical from a user perspective: a valid CPU number
> was passed in (either from smp_processor_id() or from a previous call to
> cpumask_next*()), but there are no more bits set in the cpumask. There are no
> more CPUs to search for in either scenario, but only one produces a WARN.

It seems I have to repeat it for the 3rd time.

cpumask_next() takes a shifted cpu index. That's why cpumask_check()
must shift the index in the other direction to keep all that
checking logic consistent.

This is a bad design, and all users of cpumask_next() must be aware of
this pitfall.
 
[...]

> > Maybe we should consider nr_cpu_ids as a special valid index for
> > cpumask_check(), a sign of the end of an array. This would help to
> > silence many warnings, like this one. For now I'm leaning towards it
> > being more of a hack than a meaningful change.
> >
> 
> I agree, we definitely want to warn for e.g.
> 
>   cpumask_set_cpu(nr_cpu_ids, ...);
> 
> Could we instead make cpumask_next*() immediately return nr_cpu_ids when
> passed n=nr_cpu_ids-1?

This is what FIND_NEXT_BIT() does. If you're suggesting we silence the
warning, then why do we need it at all?
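
(For reference, the generic find_next_bit() path already begins with
roughly

  if (unlikely(offset >= size))
          return size;

so an out-of-range start silently returns the bitmap size; the only
question is whether we also want the warning on top of that.)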
 
> Also, what about cpumask_next_wrap()? That uses cpumask_next() under the
> hood and is bound to warn when wrapping after n=nr_cpu_ids-1, I think.

I'm working on a fix for it. Hopefully I'll merge it in the next window.

Thanks,
Yury

Patch

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index e4f9136a4a63..a1cd4eb1a3d6 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -174,9 +174,8 @@  static inline unsigned int cpumask_last(const struct cpumask *srcp)
 static inline
 unsigned int cpumask_next(int n, const struct cpumask *srcp)
 {
-	/* -1 is a legal arg here. */
-	if (n != -1)
-		cpumask_check(n);
+	/* n is a prior cpu */
+	cpumask_check(n + 1);
 	return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
 }
 
@@ -189,9 +188,8 @@  unsigned int cpumask_next(int n, const struct cpumask *srcp)
  */
 static inline unsigned int cpumask_next_zero(int n, const struct cpumask *srcp)
 {
-	/* -1 is a legal arg here. */
-	if (n != -1)
-		cpumask_check(n);
+	/* n is a prior cpu */
+	cpumask_check(n + 1);
 	return find_next_zero_bit(cpumask_bits(srcp), nr_cpumask_bits, n+1);
 }
 
@@ -231,9 +229,8 @@  static inline
 unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
 		     const struct cpumask *src2p)
 {
-	/* -1 is a legal arg here. */
-	if (n != -1)
-		cpumask_check(n);
+	/* n is a prior cpu */
+	cpumask_check(n + 1);
 	return find_next_and_bit(cpumask_bits(src1p), cpumask_bits(src2p),
 		nr_cpumask_bits, n + 1);
 }
@@ -267,8 +264,8 @@  static inline
 unsigned int cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool wrap)
 {
 	cpumask_check(start);
-	if (n != -1)
-		cpumask_check(n);
+	/* n is a prior cpu */
+	cpumask_check(n + 1);
 
 	/*
 	 * Return the first available CPU when wrapping, or when starting before cpu0,