diff mbox series

[v4,1/3] cacheinfo: Add arch specific early level initializer

Message ID 20230412185759.755408-2-rrendec@redhat.com (mailing list archive)
State New, archived
Headers show
Series arch_topology: Pre-allocate cacheinfo from primary CPU | expand

Commit Message

Radu Rendec April 12, 2023, 6:57 p.m. UTC
This patch gives architecture specific code the ability to initialize
the cache level and allocate cacheinfo memory early, when cache level
initialization runs on the primary CPU for all possible CPUs.

This is part of a patch series that attempts to further the work in
commit 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU").
Previously, in the absence of any DT/ACPI cache info, architecture
specific cache detection and info allocation for secondary CPUs would
happen in non-preemptible context during early CPU initialization and
trigger a "BUG: sleeping function called from invalid context" splat on
an RT kernel.

More specifically, this patch adds the early_cache_level() function,
which is called by fetch_cache_info() as a fallback when the number of
cache leaves cannot be extracted from DT/ACPI. In the default generic
(weak) implementation, this new function returns -ENOENT, which
preserves the original behavior for architectures that do not implement
the function.

Since early detection can get the number of cache leaves wrong in some
cases*, additional logic is added to still call init_cache_level() later
on the secondary CPU, therefore giving the architecture specific code an
opportunity to go back and fix the initial guess. Again, the original
behavior is preserved for architectures that do not implement the new
function.

* For example, on arm64, CLIDR_EL1 detection works only when it runs on
  the current CPU. In other words, a CPU cannot detect the cache depth
  for any other CPU than itself.

Signed-off-by: Radu Rendec <rrendec@redhat.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
---
 drivers/base/cacheinfo.c  | 75 +++++++++++++++++++++++++++------------
 include/linux/cacheinfo.h |  2 ++
 2 files changed, 55 insertions(+), 22 deletions(-)

Comments

Sudeep Holla May 15, 2023, 9:36 a.m. UTC | #1
On Wed, May 10, 2023 at 12:12:07PM -0700, Ricardo Neri wrote:
> Hi,
> 
> I had posted a patchset[1] for x86 that initializes
> ci_cacheinfo(cpu)->num_leaves during SMP boot.
>

It is entirely clear to me if this is just a clean up or a fix to some
issue you faced ? Just wanted to let you know Prateek from AMD has couple
of fixes [2]

> This means that early_leaves and a late cache_leaves() are equal but
> per_cpu_cacheinfo(cpu) is never allocated. Currently, x86 does not use
> fetch_cache_info().
> 
> I think that we should check here that per_cpu_cacheinfo() has been allocated to
> take care of the case in which early and late cache leaves remain the same:
> 
> -       if (cache_leaves(cpu) <= early_leaves)
> +       if (cache_leaves(cpu) <= early_leaves && per_cpu_cacheinfo(cpu))
> 
> Otherwise, in v6.4-rc1 + [1] I observe a NULL pointer dereference from
> last_level_cache_is_valid().
>

I think this is different issue as Prateek was just observing wrong info
after cpuhotplug operations. But the patches manage the cpumap_populated
state better with the patches. Can you please look at that as weel ?
Ricardo Neri May 18, 2023, 1:27 a.m. UTC | #2
On Mon, May 15, 2023 at 10:36:08AM +0100, Sudeep Holla wrote:
> On Wed, May 10, 2023 at 12:12:07PM -0700, Ricardo Neri wrote:
> > Hi,
> > 
> > I had posted a patchset[1] for x86 that initializes
> > ci_cacheinfo(cpu)->num_leaves during SMP boot.
> >
> 
> It is entirely clear to me if this is just a clean up or a fix to some
> issue you faced ? Just wanted to let you know Prateek from AMD has couple
> of fixes [2]

My first patch is a bug fix. The second patch is clean up that results
from fixing the bug in patch 1.

> 
> > This means that early_leaves and a late cache_leaves() are equal but
> > per_cpu_cacheinfo(cpu) is never allocated. Currently, x86 does not use
> > fetch_cache_info().
> > 
> > I think that we should check here that per_cpu_cacheinfo() has been allocated to
> > take care of the case in which early and late cache leaves remain the same:
> > 
> > -       if (cache_leaves(cpu) <= early_leaves)
> > +       if (cache_leaves(cpu) <= early_leaves && per_cpu_cacheinfo(cpu))
> > 
> > Otherwise, in v6.4-rc1 + [1] I observe a NULL pointer dereference from
> > last_level_cache_is_valid().
> >
> 
> I think this is different issue as Prateek was just observing wrong info
> after cpuhotplug operations. But the patches manage the cpumap_populated
> state better with the patches. Can you please look at that as weel ?

I verified that the patches from Prateek fix a different issue. I was able
to reproduce his issue. His patches fixes it.

I still see my issue after applying Prateek's patches.
> 
> -- 
> Regards,
> Sudeep
> 
> [2] https://lore.kernel.org/all/20230508084115.1157-1-kprateek.nayak@amd.com

Thank you for the suggestion!

BR,
Ricardo
Sudeep Holla May 18, 2023, 9:34 a.m. UTC | #3
On Wed, May 17, 2023 at 06:27:03PM -0700, Ricardo Neri wrote:
> On Mon, May 15, 2023 at 10:36:08AM +0100, Sudeep Holla wrote:
> > On Wed, May 10, 2023 at 12:12:07PM -0700, Ricardo Neri wrote:
> > > Hi,
> > > 
> > > I had posted a patchset[1] for x86 that initializes
> > > ci_cacheinfo(cpu)->num_leaves during SMP boot.
> > >
> > 
> > It is entirely clear to me if this is just a clean up or a fix to some
> > issue you faced ? Just wanted to let you know Prateek from AMD has couple
> > of fixes [2]
> 
> My first patch is a bug fix. The second patch is clean up that results
> from fixing the bug in patch 1.
> 
> > 
> > > This means that early_leaves and a late cache_leaves() are equal but
> > > per_cpu_cacheinfo(cpu) is never allocated. Currently, x86 does not use
> > > fetch_cache_info().
> > > 
> > > I think that we should check here that per_cpu_cacheinfo() has been allocated to
> > > take care of the case in which early and late cache leaves remain the same:
> > > 
> > > -       if (cache_leaves(cpu) <= early_leaves)
> > > +       if (cache_leaves(cpu) <= early_leaves && per_cpu_cacheinfo(cpu))
> > > 
> > > Otherwise, in v6.4-rc1 + [1] I observe a NULL pointer dereference from
> > > last_level_cache_is_valid().
> > >
> > 
> > I think this is different issue as Prateek was just observing wrong info
> > after cpuhotplug operations. But the patches manage the cpumap_populated
> > state better with the patches. Can you please look at that as weel ?
> 
> I verified that the patches from Prateek fix a different issue. I was able
> to reproduce his issue. His patches fixes it.
> 
> I still see my issue after applying Prateek's patches.

Thanks, I thought it is different issue and good that you were able to test
them as well. Please post a proper patch for the NULL ptr dereference you
are hitting on x86.
Ricardo Neri May 19, 2023, 9:44 p.m. UTC | #4
On Thu, May 11, 2023 at 03:55:18PM -0400, Radu Rendec wrote:
> On Wed, 2023-05-10 at 17:00 -0700, Ricardo Neri wrote:
> > On Wed, May 10, 2023 at 04:44:49PM -0400, Radu Rendec wrote:
> > > On Wed, 2023-05-10 at 12:12 -0700, Ricardo Neri wrote:
> > > > On Wed, Apr 12, 2023 at 02:57:57PM -0400, Radu Rendec wrote:
> > > > > This patch gives architecture specific code the ability to initialize
> > > > > the cache level and allocate cacheinfo memory early, when cache level
> > > > > initialization runs on the primary CPU for all possible CPUs.
> > > [cut]
> > > > > -int detect_cache_attributes(unsigned int cpu)
> > > > > +static inline int init_level_allocate_ci(unsigned int cpu)
> > > > >  {
> > > > > -       int ret;
> > > > > +       unsigned int early_leaves = cache_leaves(cpu);
> > > > >  
> > > > >         /* Since early initialization/allocation of the cacheinfo is allowed
> > > > >          * via fetch_cache_info() and this also gets called as CPU hotplug
> > > > >          * callbacks via cacheinfo_cpu_online, the init/alloc can be skipped
> > > > >          * as it will happen only once (the cacheinfo memory is never freed).
> > > > > -        * Just populate the cacheinfo.
> > > > > +        * Just populate the cacheinfo. However, if the cacheinfo has been
> > > > > +        * allocated early through the arch-specific early_cache_level() call,
> > > > > +        * there is a chance the info is wrong (this can happen on arm64). In
> > > > > +        * that case, call init_cache_level() anyway to give the arch-specific
> > > > > +        * code a chance to make things right.
> > > > >          */
> > > > > -       if (per_cpu_cacheinfo(cpu))
> > > > > -               goto populate_leaves;
> > > > > +       if (per_cpu_cacheinfo(cpu) && !ci_cacheinfo(cpu)->early_ci_levels)
> > > > > +               return 0;
> > > > >  
> > > > >         if (init_cache_level(cpu) || !cache_leaves(cpu))
> > > > >                 return -ENOENT;
> > > > >  
> > > > > -       ret = allocate_cache_info(cpu);
> > > > > +       /*
> > > > > +        * Now that we have properly initialized the cache level info, make
> > > > > +        * sure we don't try to do that again the next time we are called
> > > > > +        * (e.g. as CPU hotplug callbacks).
> > > > > +        */
> > > > > +       ci_cacheinfo(cpu)->early_ci_levels = false;
> > > > > +
> > > > > +       if (cache_leaves(cpu) <= early_leaves)
> > > > > +               return 0;
> > > > > +
> > > > 
> > > > I had posted a patchset[1] for x86 that initializes
> > > > ci_cacheinfo(cpu)->num_leaves during SMP boot.
> > > > 
> > > > This means that early_leaves and a late cache_leaves() are equal but
> > > > per_cpu_cacheinfo(cpu) is never allocated. Currently, x86 does not use
> > > > fetch_cache_info().
> > > > 
> > > > I think that we should check here that per_cpu_cacheinfo() has been allocated to
> > > > take care of the case in which early and late cache leaves remain the same:
> > > > 
> > > > -       if (cache_leaves(cpu) <= early_leaves)
> > > > +       if (cache_leaves(cpu) <= early_leaves && per_cpu_cacheinfo(cpu))
> > > > 
> > > > Otherwise, in v6.4-rc1 + [1] I observe a NULL pointer dereference from
> > > > last_level_cache_is_valid().
> > > > 
> > > > I can post a patch with this fix if it makes sense.
> > > > 
> > > > [1]. https://lore.kernel.org/all/20230424001956.21434-3-ricardo.neri-calderon@linux.intel.com/
> > > 
> > > Thanks for bringing this to my attention. I need to run some tests on
> > > x86 (I did all that work/testing on arm64) and wrap my head around it.
> > > 
> > > While I don't see any problem with the fix you're proposing, I'm afraid
> > > it may circle back to the other problem I tried to fix initially. Have
> > > you tested this on an RT kernel by any chance?
> > 
> > That is a good point. I did not test on an RT kernel. I'll try that.
> 
> It looks like the flow is much simpler on x86: detect_cache_attributes()
> is called only once for each CPU, and it's called in kthread context.
> 
> I haven't tested on an RT kernel but I think it should be fine. I put a
> msleep() there and saw no issues, which means kmalloc() on RT should be
> fine as well.

I booted the realtime kernel [3] with CONFIG_PREEMPT_RT and did not observe
the BUG splat. I tried before your patchset. Were you able to reproduce on
x86? Also, I was not able to reproduce the BUG splat after your changes +
[1] + my earlier suggested patch in this thread.

> 
> > > I'm thinking that if we end up in init_level_allocate_ci() without the
> > > cacheinfo memory having been allocated earlier, we're up for a "BUG"
> > > splat on RT kernels.
> > > 
> > > If early_leaves has the right value at that point, the cacheinfo memory
> > > should be allocated early (on the primary CPU), so perhaps there's a
> > > different problem somewhere else.
> > 
> > That can work for x86, IMO. Not sure about other archs. As you mention,
> > other archs still want the chance to correct the early cache info.
> 
> You're right. I got confused for a moment because I was used to the
> arm64 flow. On x86, there is no "early" cache info per se because, as I
> already mentioned, detect_cache_attributes() is called only once for
> each CPU.

Indeed.

> 
> I was intrigued about how this worked without your changes, and I
> looked closer. Between the initialization of the early_leaves variable
> at the beginning of init_level_allocate_ci() and the comparison of
> cache_leaves(cpu) and early_leaves, init_cache_level() gets called.
> Before your changes, (struct cpu_cacheinfo).num_leaves was initialized
> to 0 and then changed in init_cache_level(). That way, early_leaves
> ended up as 0, which made the comparison evaluate to false.

Yes my changes aim to use (struct cpu_cacheinfo).num_leaves directly.

> 
> At this point I think the patch you proposed is the right way to fix
> this. I don't see any reason why it would interfere with other archs
> that really use early allocation. This new patch should probably be
> added to your series, since otherwise your other patches would
> basically "introduce" a null-pointer deref.
> 
> My only suggestion would be to add a short comment before the
> comparison, to explain that on x86 detect_cache_attributes() is called
> only once for each CPU and so early allocation is not possible but
> (struct cpu_cacheinfo).num_leaves is already initialized by the time
> detect_cache_attributes() is called.

Thank you very much for your help!

BR,
Ricardo
Radu Rendec May 19, 2023, 10:02 p.m. UTC | #5
Hi Ricardo,

On Fri, 2023-05-19 at 14:44 -0700, Ricardo Neri wrote:
> On Thu, May 11, 2023 at 03:55:18PM -0400, Radu Rendec wrote:
> > On Wed, 2023-05-10 at 17:00 -0700, Ricardo Neri wrote:
> > > On Wed, May 10, 2023 at 04:44:49PM -0400, Radu Rendec wrote:
> > > > On Wed, 2023-05-10 at 12:12 -0700, Ricardo Neri wrote:
> > > > > On Wed, Apr 12, 2023 at 02:57:57PM -0400, Radu Rendec wrote:
> > > > > > This patch gives architecture specific code the ability to initialize
> > > > > > the cache level and allocate cacheinfo memory early, when cache level
> > > > > > initialization runs on the primary CPU for all possible CPUs.
> > > > [cut]
> > > > > > -int detect_cache_attributes(unsigned int cpu)
> > > > > > +static inline int init_level_allocate_ci(unsigned int cpu)
> > > > > >  {
> > > > > > -       int ret;
> > > > > > +       unsigned int early_leaves = cache_leaves(cpu);
> > > > > >  
> > > > > >         /* Since early initialization/allocation of the cacheinfo is allowed
> > > > > >          * via fetch_cache_info() and this also gets called as CPU hotplug
> > > > > >          * callbacks via cacheinfo_cpu_online, the init/alloc can be skipped
> > > > > >          * as it will happen only once (the cacheinfo memory is never freed).
> > > > > > -        * Just populate the cacheinfo.
> > > > > > +        * Just populate the cacheinfo. However, if the cacheinfo has been
> > > > > > +        * allocated early through the arch-specific early_cache_level() call,
> > > > > > +        * there is a chance the info is wrong (this can happen on arm64). In
> > > > > > +        * that case, call init_cache_level() anyway to give the arch-specific
> > > > > > +        * code a chance to make things right.
> > > > > >          */
> > > > > > -       if (per_cpu_cacheinfo(cpu))
> > > > > > -               goto populate_leaves;
> > > > > > +       if (per_cpu_cacheinfo(cpu) && !ci_cacheinfo(cpu)->early_ci_levels)
> > > > > > +               return 0;
> > > > > >  
> > > > > >         if (init_cache_level(cpu) || !cache_leaves(cpu))
> > > > > >                 return -ENOENT;
> > > > > >  
> > > > > > -       ret = allocate_cache_info(cpu);
> > > > > > +       /*
> > > > > > +        * Now that we have properly initialized the cache level info, make
> > > > > > +        * sure we don't try to do that again the next time we are called
> > > > > > +        * (e.g. as CPU hotplug callbacks).
> > > > > > +        */
> > > > > > +       ci_cacheinfo(cpu)->early_ci_levels = false;
> > > > > > +
> > > > > > +       if (cache_leaves(cpu) <= early_leaves)
> > > > > > +               return 0;
> > > > > > +
> > > > > 
> > > > > I had posted a patchset[1] for x86 that initializes
> > > > > ci_cacheinfo(cpu)->num_leaves during SMP boot.
> > > > > 
> > > > > This means that early_leaves and a late cache_leaves() are equal but
> > > > > per_cpu_cacheinfo(cpu) is never allocated. Currently, x86 does not use
> > > > > fetch_cache_info().
> > > > > 
> > > > > I think that we should check here that per_cpu_cacheinfo() has been allocated to
> > > > > take care of the case in which early and late cache leaves remain the same:
> > > > > 
> > > > > -       if (cache_leaves(cpu) <= early_leaves)
> > > > > +       if (cache_leaves(cpu) <= early_leaves && per_cpu_cacheinfo(cpu))
> > > > > 
> > > > > Otherwise, in v6.4-rc1 + [1] I observe a NULL pointer dereference from
> > > > > last_level_cache_is_valid().
> > > > > 
> > > > > I can post a patch with this fix if it makes sense.
> > > > > 
> > > > > [1]. https://lore.kernel.org/all/20230424001956.21434-3-ricardo.neri-calderon@linux.intel.com/
> > > > 
> > > > Thanks for bringing this to my attention. I need to run some tests on
> > > > x86 (I did all that work/testing on arm64) and wrap my head around it.
> > > > 
> > > > While I don't see any problem with the fix you're proposing, I'm afraid
> > > > it may circle back to the other problem I tried to fix initially. Have
> > > > you tested this on an RT kernel by any chance?
> > > 
> > > That is a good point. I did not test on an RT kernel. I'll try that.
> > 
> > It looks like the flow is much simpler on x86: detect_cache_attributes()
> > is called only once for each CPU, and it's called in kthread context.
> > 
> > I haven't tested on an RT kernel but I think it should be fine. I put a
> > msleep() there and saw no issues, which means kmalloc() on RT should be
> > fine as well.
> 
> I booted the realtime kernel [3] with CONFIG_PREEMPT_RT and did not observe
> the BUG splat. I tried before your patchset. Were you able to reproduce on
> x86? Also, I was not able to reproduce the BUG splat after your changes +
> [1] + my earlier suggested patch in this thread.

Thanks for trying this out. I think the BUG splat cannot be reproduced
on x86, either with or without my fix because detect_cache_attributes()
is always called in kthread context, and that makes kmalloc() happy on
RT kernels.

At the time when I first asked you about the RT kernel, I hadn't looked
closely at the x86 code yet, and I was unaware of the (much simpler)
flow and the kthread context. Sorry for the confusion!

In any case, your test on the RT kernel validates the conclusion that
we already came to, and eliminates any trace of doubt. FWIW, I was
testing on arm64 when I found the bug and created the fix. Things are
much more complicated there, and detect_cache_attributes() is called
twice for each CPU (the second time with preemption disabled).

Best regards,
Radu
Sudeep Holla May 31, 2023, 12:22 p.m. UTC | #6
On Thu, May 18, 2023 at 10:34:14AM +0100, Sudeep Holla wrote:
> On Wed, May 17, 2023 at 06:27:03PM -0700, Ricardo Neri wrote:
> > On Mon, May 15, 2023 at 10:36:08AM +0100, Sudeep Holla wrote:
> > > On Wed, May 10, 2023 at 12:12:07PM -0700, Ricardo Neri wrote:
> > > > Hi,
> > > > 
> > > > I had posted a patchset[1] for x86 that initializes
> > > > ci_cacheinfo(cpu)->num_leaves during SMP boot.
> > > >
> > > 
> > > It is entirely clear to me if this is just a clean up or a fix to some
> > > issue you faced ? Just wanted to let you know Prateek from AMD has couple
> > > of fixes [2]
> > 
> > My first patch is a bug fix. The second patch is clean up that results
> > from fixing the bug in patch 1.
> > 
> > > 
> > > > This means that early_leaves and a late cache_leaves() are equal but
> > > > per_cpu_cacheinfo(cpu) is never allocated. Currently, x86 does not use
> > > > fetch_cache_info().
> > > > 
> > > > I think that we should check here that per_cpu_cacheinfo() has been allocated to
> > > > take care of the case in which early and late cache leaves remain the same:
> > > > 
> > > > -       if (cache_leaves(cpu) <= early_leaves)
> > > > +       if (cache_leaves(cpu) <= early_leaves && per_cpu_cacheinfo(cpu))
> > > > 
> > > > Otherwise, in v6.4-rc1 + [1] I observe a NULL pointer dereference from
> > > > last_level_cache_is_valid().
> > > >
> > > 
> > > I think this is different issue as Prateek was just observing wrong info
> > > after cpuhotplug operations. But the patches manage the cpumap_populated
> > > state better with the patches. Can you please look at that as weel ?
> > 
> > I verified that the patches from Prateek fix a different issue. I was able
> > to reproduce his issue. His patches fixes it.
> > 
> > I still see my issue after applying Prateek's patches.
> 
> Thanks, I thought it is different issue and good that you were able to test
> them as well. Please post a proper patch for the NULL ptr dereference you
> are hitting on x86.

Gentle ping! Are you still observing NULL ptr dereference with v6.4-rcx ?
If so, can you please post the fix as a proper patch ? Some of the patches
in v6.4-rc1 are being backported, so I prefer to have all the known issues
fixed before that happens. Sorry for the nag, but backport is the reason
I am pushing for this.
Ricardo Neri May 31, 2023, 5:03 p.m. UTC | #7
On Wed, May 31, 2023 at 01:22:01PM +0100, Sudeep Holla wrote:
> On Thu, May 18, 2023 at 10:34:14AM +0100, Sudeep Holla wrote:
> > On Wed, May 17, 2023 at 06:27:03PM -0700, Ricardo Neri wrote:
> > > On Mon, May 15, 2023 at 10:36:08AM +0100, Sudeep Holla wrote:
> > > > On Wed, May 10, 2023 at 12:12:07PM -0700, Ricardo Neri wrote:
> > > > > Hi,
> > > > > 
> > > > > I had posted a patchset[1] for x86 that initializes
> > > > > ci_cacheinfo(cpu)->num_leaves during SMP boot.
> > > > >
> > > > 
> > > > It is entirely clear to me if this is just a clean up or a fix to some
> > > > issue you faced ? Just wanted to let you know Prateek from AMD has couple
> > > > of fixes [2]
> > > 
> > > My first patch is a bug fix. The second patch is clean up that results
> > > from fixing the bug in patch 1.
> > > 
> > > > 
> > > > > This means that early_leaves and a late cache_leaves() are equal but
> > > > > per_cpu_cacheinfo(cpu) is never allocated. Currently, x86 does not use
> > > > > fetch_cache_info().
> > > > > 
> > > > > I think that we should check here that per_cpu_cacheinfo() has been allocated to
> > > > > take care of the case in which early and late cache leaves remain the same:
> > > > > 
> > > > > -       if (cache_leaves(cpu) <= early_leaves)
> > > > > +       if (cache_leaves(cpu) <= early_leaves && per_cpu_cacheinfo(cpu))
> > > > > 
> > > > > Otherwise, in v6.4-rc1 + [1] I observe a NULL pointer dereference from
> > > > > last_level_cache_is_valid().
> > > > >
> > > > 
> > > > I think this is different issue as Prateek was just observing wrong info
> > > > after cpuhotplug operations. But the patches manage the cpumap_populated
> > > > state better with the patches. Can you please look at that as weel ?
> > > 
> > > I verified that the patches from Prateek fix a different issue. I was able
> > > to reproduce his issue. His patches fixes it.
> > > 
> > > I still see my issue after applying Prateek's patches.
> > 
> > Thanks, I thought it is different issue and good that you were able to test
> > them as well. Please post a proper patch for the NULL ptr dereference you
> > are hitting on x86.
> 
> Gentle ping! Are you still observing NULL ptr dereference with v6.4-rcx ?

Yes, I still observe it on v6.4-rc4.

> If so, can you please post the fix as a proper patch ? Some of the patches
> in v6.4-rc1 are being backported, so I prefer to have all the known issues
> fixed before that happens. Sorry for the nag, but backport is the reason
> I am pushing for this.

Sure. Sorry for the delay. I have the patch ready and post this week. I
will post it as part my previous patches in [1].

Thanks and BR,
Ricardo
Ricardo Neri Aug. 7, 2023, 11:23 p.m. UTC | #8
On Wed, May 31, 2023 at 10:03:36AM -0700, Ricardo Neri wrote:
> On Wed, May 31, 2023 at 01:22:01PM +0100, Sudeep Holla wrote:
> > On Thu, May 18, 2023 at 10:34:14AM +0100, Sudeep Holla wrote:
> > > On Wed, May 17, 2023 at 06:27:03PM -0700, Ricardo Neri wrote:
> > > > On Mon, May 15, 2023 at 10:36:08AM +0100, Sudeep Holla wrote:
> > > > > On Wed, May 10, 2023 at 12:12:07PM -0700, Ricardo Neri wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > I had posted a patchset[1] for x86 that initializes
> > > > > > ci_cacheinfo(cpu)->num_leaves during SMP boot.
> > > > > >
> > > > > 
> > > > > It is entirely clear to me if this is just a clean up or a fix to some
> > > > > issue you faced ? Just wanted to let you know Prateek from AMD has couple
> > > > > of fixes [2]
> > > > 
> > > > My first patch is a bug fix. The second patch is clean up that results
> > > > from fixing the bug in patch 1.
> > > > 
> > > > > 
> > > > > > This means that early_leaves and a late cache_leaves() are equal but
> > > > > > per_cpu_cacheinfo(cpu) is never allocated. Currently, x86 does not use
> > > > > > fetch_cache_info().
> > > > > > 
> > > > > > I think that we should check here that per_cpu_cacheinfo() has been allocated to
> > > > > > take care of the case in which early and late cache leaves remain the same:
> > > > > > 
> > > > > > -       if (cache_leaves(cpu) <= early_leaves)
> > > > > > +       if (cache_leaves(cpu) <= early_leaves && per_cpu_cacheinfo(cpu))
> > > > > > 
> > > > > > Otherwise, in v6.4-rc1 + [1] I observe a NULL pointer dereference from
> > > > > > last_level_cache_is_valid().
> > > > > >
> > > > > 
> > > > > I think this is different issue as Prateek was just observing wrong info
> > > > > after cpuhotplug operations. But the patches manage the cpumap_populated
> > > > > state better with the patches. Can you please look at that as weel ?
> > > > 
> > > > I verified that the patches from Prateek fix a different issue. I was able
> > > > to reproduce his issue. His patches fixes it.
> > > > 
> > > > I still see my issue after applying Prateek's patches.
> > > 
> > > Thanks, I thought it is different issue and good that you were able to test
> > > them as well. Please post a proper patch for the NULL ptr dereference you
> > > are hitting on x86.
> > 
> > Gentle ping! Are you still observing NULL ptr dereference with v6.4-rcx ?
> 
> Yes, I still observe it on v6.4-rc4.
> 
> > If so, can you please post the fix as a proper patch ? Some of the patches
> > in v6.4-rc1 are being backported, so I prefer to have all the known issues
> > fixed before that happens. Sorry for the nag, but backport is the reason
> > I am pushing for this.
> 
> Sure. Sorry for the delay. I have the patch ready and post this week. I
> will post it as part my previous patches in [1].

I at last posted the patchet, Sudeep. You can take a look here:
https://lore.kernel.org/all/20230805012421.7002-1-ricardo.neri-calderon@linux.intel.com/

Sorry for the delay. I had to jump through various hoops before posting.

Thanks and BR,
Ricardo
diff mbox series

Patch

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index f3903d002819..d783896c8a1f 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -398,6 +398,11 @@  static void free_cache_attributes(unsigned int cpu)
 	cache_shared_cpu_map_remove(cpu);
 }
 
+int __weak early_cache_level(unsigned int cpu)
+{
+	return -ENOENT;
+}
+
 int __weak init_cache_level(unsigned int cpu)
 {
 	return -ENOENT;
@@ -423,56 +428,82 @@  int allocate_cache_info(int cpu)
 
 int fetch_cache_info(unsigned int cpu)
 {
-	struct cpu_cacheinfo *this_cpu_ci;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	unsigned int levels = 0, split_levels = 0;
 	int ret;
 
 	if (acpi_disabled) {
 		ret = init_of_cache_level(cpu);
-		if (ret < 0)
-			return ret;
 	} else {
 		ret = acpi_get_cache_info(cpu, &levels, &split_levels);
-		if (ret < 0)
+		if (!ret) {
+			this_cpu_ci->num_levels = levels;
+			/*
+			 * This assumes that:
+			 * - there cannot be any split caches (data/instruction)
+			 *   above a unified cache
+			 * - data/instruction caches come by pair
+			 */
+			this_cpu_ci->num_leaves = levels + split_levels;
+		}
+	}
+
+	if (ret || !cache_leaves(cpu)) {
+		ret = early_cache_level(cpu);
+		if (ret)
 			return ret;
 
-		this_cpu_ci = get_cpu_cacheinfo(cpu);
-		this_cpu_ci->num_levels = levels;
-		/*
-		 * This assumes that:
-		 * - there cannot be any split caches (data/instruction)
-		 *   above a unified cache
-		 * - data/instruction caches come by pair
-		 */
-		this_cpu_ci->num_leaves = levels + split_levels;
+		if (!cache_leaves(cpu))
+			return -ENOENT;
+
+		this_cpu_ci->early_ci_levels = true;
 	}
-	if (!cache_leaves(cpu))
-		return -ENOENT;
 
 	return allocate_cache_info(cpu);
 }
 
-int detect_cache_attributes(unsigned int cpu)
+static inline int init_level_allocate_ci(unsigned int cpu)
 {
-	int ret;
+	unsigned int early_leaves = cache_leaves(cpu);
 
 	/* Since early initialization/allocation of the cacheinfo is allowed
 	 * via fetch_cache_info() and this also gets called as CPU hotplug
 	 * callbacks via cacheinfo_cpu_online, the init/alloc can be skipped
 	 * as it will happen only once (the cacheinfo memory is never freed).
-	 * Just populate the cacheinfo.
+	 * Just populate the cacheinfo. However, if the cacheinfo has been
+	 * allocated early through the arch-specific early_cache_level() call,
+	 * there is a chance the info is wrong (this can happen on arm64). In
+	 * that case, call init_cache_level() anyway to give the arch-specific
+	 * code a chance to make things right.
 	 */
-	if (per_cpu_cacheinfo(cpu))
-		goto populate_leaves;
+	if (per_cpu_cacheinfo(cpu) && !ci_cacheinfo(cpu)->early_ci_levels)
+		return 0;
 
 	if (init_cache_level(cpu) || !cache_leaves(cpu))
 		return -ENOENT;
 
-	ret = allocate_cache_info(cpu);
+	/*
+	 * Now that we have properly initialized the cache level info, make
+	 * sure we don't try to do that again the next time we are called
+	 * (e.g. as CPU hotplug callbacks).
+	 */
+	ci_cacheinfo(cpu)->early_ci_levels = false;
+
+	if (cache_leaves(cpu) <= early_leaves)
+		return 0;
+
+	kfree(per_cpu_cacheinfo(cpu));
+	return allocate_cache_info(cpu);
+}
+
+int detect_cache_attributes(unsigned int cpu)
+{
+	int ret;
+
+	ret = init_level_allocate_ci(cpu);
 	if (ret)
 		return ret;
 
-populate_leaves:
 	/*
 	 * If LLC is valid the cache leaves were already populated so just go to
 	 * update the cpu map.
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 908e19d17f49..6147b2672555 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -76,9 +76,11 @@  struct cpu_cacheinfo {
 	unsigned int num_levels;
 	unsigned int num_leaves;
 	bool cpu_map_populated;
+	bool early_ci_levels;
 };
 
 struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu);
+int early_cache_level(unsigned int cpu);
 int init_cache_level(unsigned int cpu);
 int init_of_cache_level(unsigned int cpu);
 int populate_cache_leaves(unsigned int cpu);