[v3,2/2] x86: Fix /proc/cpuinfo cpumask warning

Message ID 20221014155845.1986223-3-ajones@ventanamicro.com (mailing list archive)
State Superseded
Series Fix /proc/cpuinfo cpumask warning

Commit Message

Andrew Jones Oct. 14, 2022, 3:58 p.m. UTC
Commit 78e5a3399421 ("cpumask: fix checking valid cpu range") has
started issuing warnings[*] when cpu indices equal to nr_cpu_ids - 1
are passed to cpumask_next* functions. seq_read_iter() and cpuinfo's
start and next seq operations implement a pattern like

  n = cpumask_next(n - 1, mask);
  show(n);
  while (1) {
      ++n;
      n = cpumask_next(n - 1, mask);
      if (n >= nr_cpu_ids)
          break;
      show(n);
  }

which will issue the warning when reading /proc/cpuinfo. Ensure no
warning is generated by validating the cpu index before calling
cpumask_next().

[*] Warnings will only appear with DEBUG_PER_CPU_MAPS enabled.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Cc: Yury Norov <yury.norov@gmail.com>
---
 arch/x86/kernel/cpu/proc.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Andrew Jones Oct. 28, 2022, 7:48 a.m. UTC | #1
On Fri, Oct 14, 2022 at 05:58:45PM +0200, Andrew Jones wrote:
> Commit 78e5a3399421 ("cpumask: fix checking valid cpu range") has
> started issuing warnings[*] when cpu indices equal to nr_cpu_ids - 1
> are passed to cpumask_next* functions. seq_read_iter() and cpuinfo's
> start and next seq operations implement a pattern like
> 
>   n = cpumask_next(n - 1, mask);
>   show(n);
>   while (1) {
>       ++n;
>       n = cpumask_next(n - 1, mask);
>       if (n >= nr_cpu_ids)
>           break;
>       show(n);
>   }
> 
> which will issue the warning when reading /proc/cpuinfo. Ensure no
> warning is generated by validating the cpu index before calling
> cpumask_next().
> 
> [*] Warnings will only appear with DEBUG_PER_CPU_MAPS enabled.
> 
> Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
> Cc: Yury Norov <yury.norov@gmail.com>
> ---
>  arch/x86/kernel/cpu/proc.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
> index 099b6f0d96bd..de3f93ac6e49 100644
> --- a/arch/x86/kernel/cpu/proc.c
> +++ b/arch/x86/kernel/cpu/proc.c
> @@ -153,6 +153,9 @@ static int show_cpuinfo(struct seq_file *m, void *v)
>  
>  static void *c_start(struct seq_file *m, loff_t *pos)
>  {
> +	if (*pos == nr_cpu_ids)
> +		return NULL;
> +
>  	*pos = cpumask_next(*pos - 1, cpu_online_mask);
>  	if ((*pos) < nr_cpu_ids)
>  		return &cpu_data(*pos);
> -- 
> 2.37.3
>

Hi x86 maintainers,

I realize 78e5a3399421 has now been reverted, so this fix is no longer
urgent. I don't believe it's wrong, though, so if it's still of interest,
then please consider this a friendly ping.

Thanks,
drew
Yury Norov Oct. 28, 2022, 2:46 p.m. UTC | #2
On Fri, Oct 28, 2022 at 09:48:28AM +0200, Andrew Jones wrote:
> Hi x86 maintainers,
> 
> I realize 78e5a3399421 has now been reverted, so this fix is no longer
> urgent. I don't believe it's wrong, though, so if it's still of interest,
> then please consider this a friendly ping.
> 
> Thanks,
> drew

Hi Andrew,

I'll take it in bitmap-for-next this weekend.

Thanks,
Yury
Borislav Petkov Oct. 28, 2022, 3:03 p.m. UTC | #3
On Fri, Oct 28, 2022 at 07:46:08AM -0700, Yury Norov wrote:
> I'll take it in bitmap-for-next this weekend.

Why?
Borislav Petkov Oct. 28, 2022, 4:06 p.m. UTC | #4
On Fri, Oct 28, 2022 at 10:13:28AM -0500, Yury Norov wrote:
> Because it's related to bitmap API usage and has been revealed after
> some work in bitmaps.

So first of all, that "fix" needs to explain what exactly it is fixing.
Not "it fixes this and that warning" but why the input arg to
cpumask_next() cannot be nr_cpu_ids because... yadda yadda...

> And because nobody else cares.

Why do you assume that?

> If you're willing to move it yourself please go ahead.

If it fixes a real issue, we are taking it. And pls note that x86
patches go through the tip tree.

Thx.
Andrew Jones Oct. 31, 2022, 8:06 a.m. UTC | #5
On Fri, Oct 28, 2022 at 06:06:41PM +0200, Borislav Petkov wrote:
> On Fri, Oct 28, 2022 at 10:13:28AM -0500, Yury Norov wrote:
> > Because it's related to bitmap API usage and has been revealed after
> > some work in bitmaps.
> 
> So first of all, that "fix" needs to explain what exactly it is fixing.
> Not "it fixes this and that warning" but why the input arg to
> cpumask_next() cannot be nr_cpu_ids because... yadda yadda...

Hi Boris,

I didn't realize you were still looking for improvements to the commit
message for this patch. I could add something like,

 The valid cpumask range is [0, nr_cpu_ids) and cpumask_next() always
 returns a CPU ID greater than its input, which results in its input
 range being [-1, nr_cpu_ids - 1). Ensure showing CPU info avoids
 triggering error conditions in cpumask_next() by stopping its loop
 over CPUs when its input would be invalid.

Thanks,
drew

> 
> > And because nobody else cares.
> 
> Why do you assume that?
> 
> > If you're willing to move it yourself please go ahead.
> 
> If it fixes a real issue, we are taking it. And pls note that x86
> patches go through the tip tree.
> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette
Borislav Petkov Oct. 31, 2022, 8:58 a.m. UTC | #6
On Mon, Oct 31, 2022 at 09:06:04AM +0100, Andrew Jones wrote:
>  The valid cpumask range is [0, nr_cpu_ids) and cpumask_next() always
>  returns a CPU ID greater than its input, which results in its input
>  range being [-1, nr_cpu_ids - 1). Ensure showing CPU info avoids
>  triggering error conditions in cpumask_next() by stopping its loop

What error conditions?

What would happen if @n is outside of the valid range?
Andrew Jones Oct. 31, 2022, 10:03 a.m. UTC | #7
On Mon, Oct 31, 2022 at 09:58:57AM +0100, Borislav Petkov wrote:
> On Mon, Oct 31, 2022 at 09:06:04AM +0100, Andrew Jones wrote:
> >  The valid cpumask range is [0, nr_cpu_ids) and cpumask_next() always
> >  returns a CPU ID greater than its input, which results in its input
> >  range being [-1, nr_cpu_ids - 1). Ensure showing CPU info avoids
> >  triggering error conditions in cpumask_next() by stopping its loop
> 
> What error conditions?
> 
> What would happen if @n is outside of the valid range?

Currently (after the revert of 78e5a3399421) with DEBUG_PER_CPU_MAPS we'll
get a warning splat when the cpu is outside the range [-1, nr_cpu_ids) and
cpumask_next() will call find_next_bit() with the input plus one anyway.
find_next_bit() doesn't explicitly document what happens when an input is
outside the range, but it currently returns the bitmap size without any
side effects, which means cpumask_next() will return nr_cpu_ids.
show_cpuinfo() doesn't try to show anything in that case and stops its
loop, or, IOW, things work fine now with an input of nr_cpu_ids - 1. But,
show_cpuinfo() is just getting away with a violated cpumask_next()
contract, which 78e5a3399421 exposed. How about a new commit message like
this

  seq_read_iter() and cpuinfo's start and next seq operations implement a
  pattern like

    n = cpumask_next(n - 1, mask);
    show(n);
    while (1) {
        ++n;
        n = cpumask_next(n - 1, mask);
        if (n >= nr_cpu_ids)
            break;
        show(n);
    }

  which loops until cpumask_next() identifies its CPU ID input is out of
  its valid range, [-1, nr_cpu_ids - 1). seq_read_iter() assumes the
  result of an invalid input is to return nr_cpu_ids or larger without any
  side effects, however the cpumask API does not document that and it
  reserves the right to change how it responds to invalid inputs. Ensure
  inputs from seq_read_iter() are valid.

Thanks,
drew
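
The boundary behaviour described in this message can be modelled in user
space. The following is only a sketch: model_find_next_bit() and
model_cpumask_next() are hypothetical stand-ins written for illustration,
not the kernel implementations, and they assume the generic behaviour
described above (an offset at or beyond the bitmap size returns the size
without reading the bitmap).

```c
#include <limits.h>

/* Bits per word in this user-space model (hypothetical helper macro). */
#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Toy stand-in for find_next_bit(): return the index of the first set bit
 * at or after 'offset', or 'size' if there is none. When offset >= size
 * the loop body never runs, so the bitmap is not read at all.
 */
static unsigned long model_find_next_bit(const unsigned long *addr,
					 unsigned long size,
					 unsigned long offset)
{
	for (unsigned long bit = offset; bit < size; bit++) {
		if (addr[bit / MODEL_BITS_PER_LONG] &
		    (1UL << (bit % MODEL_BITS_PER_LONG)))
			return bit;
	}
	return size;
}

/*
 * Toy stand-in for cpumask_next(): search strictly after 'n', so the
 * search starts at n + 1. n == -1 means "start from CPU 0".
 */
static unsigned long model_cpumask_next(int n, const unsigned long *mask,
					unsigned long nr_ids)
{
	return model_find_next_bit(mask, nr_ids, (unsigned long)(n + 1));
}
```

With nr_ids = 4 and only CPUs 0 and 2 set in the mask, an input of 1
yields 2, while inputs of 2 and 3 both yield 4. In this model the input
nr_ids - 1 is harmless; it simply cannot find anything.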
Borislav Petkov Nov. 2, 2022, 6:44 p.m. UTC | #8
On Mon, Oct 31, 2022 at 11:03:27AM +0100, Andrew Jones wrote:
> Currently (after the revert of 78e5a3399421)

After the revert?

That commit is still in the latest Linus tree.

> with DEBUG_PER_CPU_MAPS we'll get a warning splat when the cpu is
> outside the range [-1, nr_cpu_ids)

Yah, that range makes sense.

> and cpumask_next() will call find_next_bit() with the input plus one anyway.
> find_next_bit() doesn't explicitly document what happens when an input is
> outside the range, but it currently returns the bitmap size without any
> side effects, which means cpumask_next() will return nr_cpu_ids.

That is good to have in the commit message.

> show_cpuinfo() doesn't try to show anything in that case and stops its
> loop, or, IOW, things work fine now with an input of nr_cpu_ids - 1. But,
> show_cpuinfo() is just getting away with a violated cpumask_next()
> contract, which 78e5a3399421 exposed. How about a new commit message like
> this

You're making it sound more complex than it is. All you wanna say is:

"Filter out invalid cpumask_next() inputs by checking its first argument
against nr_cpu_ids because cpumask_next() will call find_next_bit() with
the input plus one but the valid range for n is [-1, nr_cpu_ids)."

But that thing with the revert above needs to be clarified first.

Thx.
Andrew Jones Nov. 3, 2022, 12:59 p.m. UTC | #9
On Wed, Nov 02, 2022 at 07:44:02PM +0100, Borislav Petkov wrote:
> On Mon, Oct 31, 2022 at 11:03:27AM +0100, Andrew Jones wrote:
> > Currently (after the revert of 78e5a3399421)
> 
> After the revert?
> 
> That commit is still in the latest Linus tree.

The revert commit is 80493877d7d0 ("Revert "cpumask: fix checking valid
cpu range".")

> 
> > with DEBUG_PER_CPU_MAPS we'll get a warning splat when the cpu is
> > outside the range [-1, nr_cpu_ids)
> 
> Yah, that range makes sense.
> 
> > and cpumask_next() will call find_next_bit() with the input plus one anyway.
> > find_next_bit() doesn't explicitly document what happens when an input is
> > outside the range, but it currently returns the bitmap size without any
> > side effects, which means cpumask_next() will return nr_cpu_ids.
> 
> That is good to have in the commit message.
> 
> > show_cpuinfo() doesn't try to show anything in that case and stops its
> > loop, or, IOW, things work fine now with an input of nr_cpu_ids - 1. But,
> > show_cpuinfo() is just getting away with a violated cpumask_next()
> > contract, which 78e5a3399421 exposed. How about a new commit message like
> > this
> 
> You're making it sound more complex than it is. All you wanna say is:
> 
> "Filter out invalid cpumask_next() inputs by checking its first argument
> against nr_cpu_ids because cpumask_next() will call find_next_bit() with
> the input plus one but the valid range for n is [-1, nr_cpu_ids)."

The patch I'm proposing ensures cpumask_next()'s range, which is actually
[-1, nr_cpu_ids - 1), isn't violated. Violating that range will generate
the warning for kernels which have commit 78e5a3399421 ("cpumask: fix
checking valid cpu range"), but not its revert.

Since 78e5a3399421 has been reverted, the value of this proposed fix is
less, and indeed the warning may even go away completely for these types
of cpumask calls[1]. However, it seems reasonable for callers to implement
their own checks until the cpumask API has documented what they should
expect.

[1] https://lore.kernel.org/all/CAHk-=wihz-GXx66MmEyaADgS1fQE_LDcB9wrHAmkvXkd8nx9tA@mail.gmail.com/

> 
> But that thing with the revert above needs to be clarified first.

I'll send a v4 with another stab at the commit message.

Thanks,
drew

> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette
Borislav Petkov Nov. 3, 2022, 3:02 p.m. UTC | #10
On Thu, Nov 03, 2022 at 01:59:45PM +0100, Andrew Jones wrote:
> The patch I'm proposing ensures cpumask_next()'s range, which is actually
> [-1, nr_cpu_ids - 1),

Lemme make sure I understand it correctly: on the upper boundary, if you
supply for n the value nr_cpu_ids - 2, then it will return potentially
the last bit if the mask is set, i.e., the one at position (nr_cpu_ids - 1).

If you supply nr_cpu_ids - 1, then it'll return nr_cpu_ids to signal no
further bits set.

Yes, no?

> I'll send a v4 with another stab at the commit message.

Yes, and it is still an unreadable mess: "A kernel compiled with commit
... but not its revert... " Nope.

First make sure cpumask_next()'s valid accepted range has been settled
upon, has been explicitly documented in a comment above it and then I'll
take a patch that fixes whatever is there to fix.

Callers should not have to filter values before passing them in - the
function either returns an error or returns the next bit in the mask.

This thing:

	if (*pos == nr_cpu_ids)

but then to pass in pos - 1:

	*pos = cpumask_next(*pos - 1

looks to me like the interface needs more cooking.

Thx.
Andrew Jones Nov. 3, 2022, 3:34 p.m. UTC | #11
On Thu, Nov 03, 2022 at 04:02:12PM +0100, Borislav Petkov wrote:
> On Thu, Nov 03, 2022 at 01:59:45PM +0100, Andrew Jones wrote:
> > The patch I'm proposing ensures cpumask_next()'s range, which is actually
> > [-1, nr_cpu_ids - 1),
> 
> Lemme make sure I understand it correctly: on the upper boundary, if you
> supply for n the value nr_cpu_ids - 2, then it will return potentially
> the last bit if the mask is set, i.e., the one at position (nr_cpu_ids - 1).
> 
> If you supply nr_cpu_ids - 1, then it'll return nr_cpu_ids to signal no
> further bits set.
> 
> Yes, no?

Yes

> 
> > I'll send a v4 with another stab at the commit message.
> 
> Yes, and it is still an unreadable mess: "A kernel compiled with commit
> ... but not its revert... " Nope.
> 
> First make sure cpumask_next()'s valid accepted range has been settled
> upon, has been explicitly documented in a comment above it and then I'll
> take a patch that fixes whatever is there to fix.

That's fair, but I'll leave that to Yury.

> 
> Callers should not have to filter values before passing them in - the
> function either returns an error or returns the next bit in the mask.

That's reasonable, but cpumask folk probably need to discuss it because
not all cpumask functions have a return value where an error may be
placed.

> 
> This thing:
> 
> 	if (*pos == nr_cpu_ids)
> 
> but then to pass in pos - 1:
> 
> 	*pos = cpumask_next(*pos - 1
> 
> looks to me like the interface needs more cooking.

Indeed, but that's less of an issue with cpumask_next() than with
the way cpuinfo implements its start and next seq ops (next
unconditionally increments *pos and then calls start and start
must use *pos - 1 since the first time it's called it needs to use
-1).

Thanks,
drew
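
The interaction between the two seq ops described here can be sketched in
user space. This is an illustration under assumptions, not the kernel
code: NR_IDS, online[], cpu_slot[], model_next(), model_c_start(),
model_c_next() and model_walk() are hypothetical names, the walk only
roughly mimics what the seq_file core does, and the 'guard' flag exists
solely so the behaviour with and without the proposed check can be
compared.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_IDS 4			/* hypothetical stand-in for nr_cpu_ids */

static const bool online[NR_IDS] = { true, false, true, true };
static int cpu_slot[NR_IDS];		/* stand-in for per-CPU data */
static int out_of_range_inputs;		/* inputs outside [-1, NR_IDS - 1) */

/* Toy cpumask_next(): first online CPU strictly after n, else NR_IDS. */
static long model_next(long n)
{
	if (n < -1 || n >= NR_IDS - 1)
		out_of_range_inputs++;	/* where a debug warning would fire */
	for (long cpu = n + 1; cpu < NR_IDS; cpu++) {
		if (cpu >= 0 && online[cpu])
			return cpu;
	}
	return NR_IDS;
}

/* start: find the first online CPU at or after *pos, hence *pos - 1. */
static void *model_c_start(long *pos, bool guard)
{
	if (guard && *pos == NR_IDS)	/* the check added by the patch */
		return NULL;
	*pos = model_next(*pos - 1);
	return *pos < NR_IDS ? &cpu_slot[*pos] : NULL;
}

/* next: unconditionally advance *pos, then reuse start. */
static void *model_c_next(long *pos, bool guard)
{
	(*pos)++;
	return model_c_start(pos, guard);
}

/* Walk every CPU via start/next; returns the number of CPUs visited. */
static int model_walk(bool guard)
{
	long pos = 0;
	int visited = 0;

	for (void *v = model_c_start(&pos, guard); v;
	     v = model_c_next(&pos, guard))
		visited++;
	return visited;
}
```

In this model both walks visit the same three CPUs. Only the unguarded
walk ends by passing NR_IDS - 1 to model_next(), which is the input the
warning objects to.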
Borislav Petkov Nov. 3, 2022, 3:54 p.m. UTC | #12
On Thu, Nov 03, 2022 at 04:34:04PM +0100, Andrew Jones wrote:
> Indeed, but that's less of an issue with cpumask_next() than with
> the way cpuinfo implements its start and next seq ops (next
> unconditionally increments *pos and then calls start and start
> must use *pos - 1 since the first time it's called it needs to use
> -1).

Maybe because those are done wrongly...

A ->next() function should not call the ->start() function. A ->start()
function should, well, only start and nothing else.

And a ->stop() function should maybe check *pos and say whether one
should stop or not.

But I haven't looked at seq_ops at least in a decade and I have no clue
whether that would work.

I'm just looking at the function pointers and am trying to spell out
what looks most natural IMO.

IOW, maybe this should be fixed "right" and not only "made to work".

Thx.
Yury Norov Nov. 3, 2022, 4:30 p.m. UTC | #13
On Thu, Nov 03, 2022 at 04:34:04PM +0100, Andrew Jones wrote:
> On Thu, Nov 03, 2022 at 04:02:12PM +0100, Borislav Petkov wrote:
> > On Thu, Nov 03, 2022 at 01:59:45PM +0100, Andrew Jones wrote:
> > > The patch I'm proposing ensures cpumask_next()'s range, which is actually
> > > [-1, nr_cpu_ids - 1),
> > 
> > Lemme make sure I understand it correctly: on the upper boundary, if you
> > supply for n the value nr_cpu_ids - 2, then it will return potentially
> > the last bit if the mask is set, i.e., the one at position (nr_cpu_ids - 1).
> > 
> > > If you supply nr_cpu_ids - 1, then it'll return nr_cpu_ids to signal no
> > further bits set.
> > 
> > Yes, no?
> 
> Yes
> 
> > 
> > > I'll send a v4 with another stab at the commit message.
> > 
> > Yes, and it is still an unreadable mess: "A kernel compiled with commit
> > ... but not its revert... " Nope.
> > 
> > First make sure cpumask_next()'s valid accepted range has been settled
> > upon, has been explicitly documented in a comment above it and then I'll
> > take a patch that fixes whatever is there to fix.
> 
> That's fair, but I'll leave that to Yury.

I'll take care of it.
 
> > Callers should not have to filter values before passing them in - the
> > function either returns an error or returns the next bit in the mask.
> 
> That's reasonable, but cpumask folk probably need to discuss it because
> not all cpumask functions have a return value where an error may be
> placed.
 
Callers should pass sane arguments into internal functions if they
expect sane output. An API not exported to userspace shouldn't
sanity-check all input arguments. For example, cpumask_next() doesn't
check srcp for NULL.

However, the cpumask API is exposed to drivers, and that's why the optional
cpumask_check() exists. (Probably. It was done long before I took
over this.)

The current *generic* implementation guarantees that an out-of-range offset
prevents cpumask_next() from dereferencing srcp and makes it
return nr_cpu_ids. This behavior is expected by many callers. However,
there are a couple of non-generic cpumask implementations, and one of
them is written in assembler. So portable code shouldn't expect more
from cpumasks than the documentation says: for a _valid_ offset,
cpumask_next() returns the next set bit or a value >= nr_cpu_ids.

cpumask_check() has been broken for years. Attempting to fix it met
so much resistance that I had to revert the patch. Now there's an
ongoing discussion about whether we need this check at all. My opinion is
that if all implementations of cpumask (more precisely, of the underlying
bitmap API) are safe against out-of-range offsets, we can simply remove
cpumask_check(). Those users, like cpuinfo, who waste time on a useless
last iteration will bear the cost themselves.
 
Thanks,
Yury
Borislav Petkov Nov. 3, 2022, 4:49 p.m. UTC | #14
On Thu, Nov 03, 2022 at 09:30:54AM -0700, yury.norov@gmail.com wrote:
> Callers should pass sane arguments into internal functions if they
> expect sane output.

What internal function? It's in a global header.

> An API not exported to userspace shouldn't sanity-check all input
> arguments.

That doesn't have anything to do with userspace at all.

APIs exported to the rest of the kernel should very well check their
inputs. Otherwise they're not APIs - just some random functions which
are visible to the compiler.

> So portable code shouldn't expect more from cpumasks than the
> documentation says: for a _valid_ offset, cpumask_next() returns the
> next set bit or a value >= nr_cpu_ids.

Lemme quote from my previous mail:

"First make sure cpumask_next()'s valid accepted range has been settled
upon, has been explicitly documented"

So where is that valid range documented?

> cpumask_check() has been broken for years. Attempting to fix it met
> so much resistance that I had to revert the patch.

The suggestion on that thread made sense: you first fix the callers and
then the interface. Just like any other "broken" kernel API.

Nothing's stopping you from fixing it properly - it'll just take a while
and if it is such a widely used interface, you probably should come up
with a strategy first how to fix it without impacting current use.

Interfaces and their in-kernel users get refactored constantly.

Thx.
Yury Norov Nov. 3, 2022, 5:31 p.m. UTC | #15
On Thu, Nov 03, 2022 at 05:49:06PM +0100, Borislav Petkov wrote:
> On Thu, Nov 03, 2022 at 09:30:54AM -0700, yury.norov@gmail.com wrote:
> > Callers should pass sane arguments into internal functions if they
> > expect sane output.
> 
> What internal function? It's in a global header.
> 
> > An API not exported to userspace shouldn't sanity-check all input
> > arguments.
> 
> That doesn't have anything to do with userspace at all.
> 
> APIs exported to the rest of the kernel should very well check their
> inputs. Otherwise they're not APIs - just some random functions which
> are visible to the compiler.

Let's take for example cpu_llc_shared_mask() added by you in
arch/x86/include/asm/smp.h recently:

  static inline struct cpumask *cpu_llc_shared_mask(int cpu)
  {
         return per_cpu(cpu_llc_shared_map, cpu);
  }

It's in a global header and available to the rest of the kernel, just as
well. How does it check its input? Maybe I missed something important in
the per_cpu() internals, but at first glance there's no protection
against -1, nr_cpu_ids, or other out-of-range arguments.
Borislav Petkov Nov. 3, 2022, 11:22 p.m. UTC | #16
On Thu, Nov 03, 2022 at 10:31:30AM -0700, Yury Norov wrote:
> Let's take for example cpu_llc_shared_mask() added by you in
> arch/x86/include/asm/smp.h recently:
> 
>   static inline struct cpumask *cpu_llc_shared_mask(int cpu)
>   {
>          return per_cpu(cpu_llc_shared_map, cpu);
>   }
> 
> It's in a global header and available to the rest of the kernel, just as
> well.

Just like 

static inline struct cpumask *cpu_l2c_shared_mask(int cpu)
{
        return per_cpu(cpu_l2c_shared_map, cpu);
}

should check != must check. 

But it's perfectly fine if you're going to attempt to prove some bogus
argument of yours - I can safely ignore you.

Patch

diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index 099b6f0d96bd..de3f93ac6e49 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -153,6 +153,9 @@  static int show_cpuinfo(struct seq_file *m, void *v)
 
 static void *c_start(struct seq_file *m, loff_t *pos)
 {
+	if (*pos == nr_cpu_ids)
+		return NULL;
+
 	*pos = cpumask_next(*pos - 1, cpu_online_mask);
 	if ((*pos) < nr_cpu_ids)
 		return &cpu_data(*pos);