diff mbox

[v3,4/4] nmi_backtrace: generate one-line reports for idle cpus

Message ID 20160322224557.GG6356@twins.programming.kicks-ass.net (mailing list archive)
State New, archived
Headers show

Commit Message

Peter Zijlstra March 22, 2016, 10:45 p.m. UTC
On Tue, Mar 22, 2016 at 11:31:11PM +0100, Rafael J. Wysocki wrote:

> > diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> > index cd4510a63375..924554f920fb 100644
> > --- a/drivers/idle/intel_idle.c
> > +++ b/drivers/idle/intel_idle.c
> > @@ -725,8 +725,8 @@ static struct cpuidle_state avn_cstates[] = {
> >   *
> >   * Must be called under local_irq_disable().
> >   */
> > -static int intel_idle(struct cpuidle_device *dev,
> > -		struct cpuidle_driver *drv, int index)
> > +static __cpuidle int intel_idle(struct cpuidle_device *dev,
> > +				struct cpuidle_driver *drv, int index)
> >  {
> >  	unsigned long ecx = 1; /* break on interrupt flag */
> >  	struct cpuidle_state *state = &drv->states[index];
> 
> Well, what about intel_idle_freeze()?  Or do we not care?

I argued against it; when you're suspended the NMI watchdog is stopped
too. Then again, you've more experience debugging that thing, so if
you think its useful its not much effort adding it.

> And analogous stuff in processor_idle.c for that matter?
> 
> acpi_idle_enter()/acpi_idle_enter_freeze() plus stuff called by those?

Ah, I only tagged acpi_processor_ffh_cstate_enter() because I went from
mwait_idle_with_hints(), I suppose acpi_safe_halt() and
acpi_idle_do_entry() itself for the INB method should cover it?

(This being one of the reasons I asked Chris to Cc you; you know this
stuff far better than I do)


---
 drivers/acpi/processor_idle.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Rafael J. Wysocki March 23, 2016, 12:50 a.m. UTC | #1
On Tuesday, March 22, 2016 11:45:57 PM Peter Zijlstra wrote:
> On Tue, Mar 22, 2016 at 11:31:11PM +0100, Rafael J. Wysocki wrote:
> 
> > > diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> > > index cd4510a63375..924554f920fb 100644
> > > --- a/drivers/idle/intel_idle.c
> > > +++ b/drivers/idle/intel_idle.c
> > > @@ -725,8 +725,8 @@ static struct cpuidle_state avn_cstates[] = {
> > >   *
> > >   * Must be called under local_irq_disable().
> > >   */
> > > -static int intel_idle(struct cpuidle_device *dev,
> > > -		struct cpuidle_driver *drv, int index)
> > > +static __cpuidle int intel_idle(struct cpuidle_device *dev,
> > > +				struct cpuidle_driver *drv, int index)
> > >  {
> > >  	unsigned long ecx = 1; /* break on interrupt flag */
> > >  	struct cpuidle_state *state = &drv->states[index];
> > 
> > Well, what about intel_idle_freeze()?  Or do we not care?
> 
> I argued against it; when you're suspended the NMI watchdog is stopped
> too.

Is it also stopped for suspend-to-idle?  I'm not sure about that.

Where do I need to look to find out?

> Then again, you've more experience debugging that thing, so if
> you think its useful its not much effort adding it.
> 
> > And analogous stuff in processor_idle.c for that matter?
> > 
> > acpi_idle_enter()/acpi_idle_enter_freeze() plus stuff called by those?
> 
> Ah, I only tagged acpi_processor_ffh_cstate_enter() because I went from
> mwait_idle_with_hints(), I suppose acpi_safe_halt() and
> acpi_idle_do_entry() itself for the INB method should cover it?

Yes, these two should be sufficient.

> (This being one of the reasons I asked Chris to Cc you; you know this
> stuff far better than I do)
> 
> ---
>  drivers/acpi/processor_idle.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index 175c86bee3a9..d5b11fff9e88 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -111,7 +111,7 @@ static const struct dmi_system_id processor_power_dmi_table[] = {
>   * Callers should disable interrupts before the call and enable
>   * interrupts after return.
>   */
> -static void acpi_safe_halt(void)
> +__cpuidle static void acpi_safe_halt(void)
>  {
>  	if (!tif_need_resched()) {
>  		safe_halt();
> @@ -680,7 +680,7 @@ static int acpi_idle_bm_check(void)
>   *
>   * Caller disables interrupt before call and enables interrupt after return.
>   */
> -static void acpi_idle_do_entry(struct acpi_processor_cx *cx)
> +__cpuidle static void acpi_idle_do_entry(struct acpi_processor_cx *cx)
>  {
>  	if (cx->entry_method == ACPI_CSTATE_FFH) {
>  		/* Call into architectural FFH based C-state */
Peter Zijlstra March 23, 2016, 7:53 a.m. UTC | #2
On Wed, Mar 23, 2016 at 01:50:00AM +0100, Rafael J. Wysocki wrote:

> > > Well, what about intel_idle_freeze()?  Or do we not care?
> > 
> > I argued against it; when you're suspended the NMI watchdog is stopped
> > too.
> 
> Is it also stopped for suspend-to-idle?  I'm not sure about that.
> 
> Where do I need to look to find out?

Hmm I have memories of writing a patch to that effect when we were
starting with that suspend-to-idle stuff, because people didn't like
being woken up all the time.

But now that I look I cannot find it either..
Chris Metcalf March 30, 2016, 5:16 p.m. UTC | #3
From the version 1 cover letter:

  This patch series modifies the trigger_xxx_backtrace() NMI-based
  remote backtracing code to make it more flexible, and makes a few
  small improvements along the way.

  The motivation comes from the task isolation code, where there are
  scenarios where we want to be able to diagnose a case where some cpu
  is about to interrupt a task-isolated cpu.  It can be helpful to
  see both where the interrupting cpu is, and also an approximation
  of where the cpu that is being interrupted is.  The nmi_backtrace
  framework allows us to discover the stack of the interrupted cpu.

I've tested that the change works as desired on tile, and build-tested
x86, arm64, and arm.  For x86 and arm64 I confirmed that the generic
cpuidle stuff as well as the architecture-specific routines are in the
new cpuidle section.  For arm I just build-tested it and made sure the
generic cpuidle routines were in the new cpuidle section, but I didn't
attempt to tease apart the tangle of platform-specific idle routines
that arm has and tag them with __cpuidle.  That might be more usefully
done by someone with arm platform experience in a follow-up patch.

I have also pushed it up to kernel.org to pull if that's easier:

git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile.git nmi-backtrace

The change conflicts with Petr Mladek's NMI printk cleanup patches:

https://lkml.kernel.org/r/1459353210-20260-1-git-send-email-pmladek@suse.com

He has kindly offered to resolve the conflicts.

v4: Added some more __cpuidle functions (PeterZ, Rafael Wysocki)
    Rebased to kernel v4.6-rc1

v3: Various improvements to the set of __cpuidle functions;
    Add back in a missing section accidentally removed in modpost.c (PeterZ)
    https://lkml.kernel.org/r/1458667179-19630-1-git-send-email-cmetcalf@mellanox.com

v2: Switch to using __cpuidle tagging, switch S-O-B to Mellanox
    https://lkml.kernel.org/r/1458147733-29338-1-git-send-email-cmetcalf@mellanox.com

Chris Metcalf (4):
  nmi_backtrace: add more trigger_*_cpu_backtrace() methods
  nmi_backtrace: do a local dump_stack() instead of a self-NMI
  arch/tile: adopt the new nmi_backtrace framework
  nmi_backtrace: generate one-line reports for idle cpus

 arch/alpha/kernel/vmlinux.lds.S      |  1 +
 arch/arc/kernel/vmlinux.lds.S        |  1 +
 arch/arm/include/asm/irq.h           |  4 +-
 arch/arm/kernel/smp.c                | 13 +------
 arch/arm/kernel/vmlinux.lds.S        |  1 +
 arch/arm64/kernel/vmlinux.lds.S      |  1 +
 arch/arm64/mm/proc.S                 |  2 +
 arch/avr32/kernel/vmlinux.lds.S      |  1 +
 arch/blackfin/kernel/vmlinux.lds.S   |  1 +
 arch/c6x/kernel/vmlinux.lds.S        |  1 +
 arch/cris/kernel/vmlinux.lds.S       |  1 +
 arch/frv/kernel/vmlinux.lds.S        |  1 +
 arch/h8300/kernel/vmlinux.lds.S      |  1 +
 arch/hexagon/kernel/vmlinux.lds.S    |  1 +
 arch/ia64/kernel/vmlinux.lds.S       |  1 +
 arch/m32r/kernel/vmlinux.lds.S       |  1 +
 arch/m68k/kernel/vmlinux-nommu.lds   |  1 +
 arch/m68k/kernel/vmlinux-std.lds     |  1 +
 arch/m68k/kernel/vmlinux-sun3.lds    |  1 +
 arch/metag/kernel/vmlinux.lds.S      |  1 +
 arch/microblaze/kernel/vmlinux.lds.S |  1 +
 arch/mips/kernel/vmlinux.lds.S       |  1 +
 arch/mn10300/kernel/vmlinux.lds.S    |  1 +
 arch/nios2/kernel/vmlinux.lds.S      |  1 +
 arch/openrisc/kernel/vmlinux.lds.S   |  1 +
 arch/parisc/kernel/vmlinux.lds.S     |  1 +
 arch/powerpc/kernel/vmlinux.lds.S    |  1 +
 arch/s390/kernel/vmlinux.lds.S       |  1 +
 arch/score/kernel/vmlinux.lds.S      |  1 +
 arch/sh/kernel/vmlinux.lds.S         |  1 +
 arch/sparc/kernel/vmlinux.lds.S      |  1 +
 arch/tile/include/asm/irq.h          |  4 +-
 arch/tile/kernel/entry.S             |  2 +-
 arch/tile/kernel/pmc.c               |  3 --
 arch/tile/kernel/process.c           | 72 ++++++++----------------------------
 arch/tile/kernel/traps.c             |  7 +++-
 arch/tile/kernel/vmlinux.lds.S       |  1 +
 arch/um/kernel/dyn.lds.S             |  1 +
 arch/um/kernel/uml.lds.S             |  1 +
 arch/unicore32/kernel/vmlinux.lds.S  |  1 +
 arch/x86/include/asm/irq.h           |  4 +-
 arch/x86/kernel/acpi/cstate.c        |  2 +-
 arch/x86/kernel/apic/hw_nmi.c        |  6 +--
 arch/x86/kernel/process.c            |  4 +-
 arch/x86/kernel/vmlinux.lds.S        |  1 +
 arch/xtensa/kernel/vmlinux.lds.S     |  3 ++
 drivers/acpi/processor_idle.c        |  5 ++-
 drivers/cpuidle/driver.c             |  5 ++-
 drivers/idle/intel_idle.c            |  4 +-
 include/asm-generic/vmlinux.lds.h    |  6 +++
 include/linux/cpu.h                  |  5 +++
 include/linux/nmi.h                  | 63 ++++++++++++++++++++++++-------
 kernel/sched/idle.c                  | 13 ++++++-
 lib/nmi_backtrace.c                  | 40 +++++++++++++-------
 scripts/mod/modpost.c                |  2 +-
 scripts/recordmcount.c               |  1 +
 scripts/recordmcount.pl              |  1 +
 57 files changed, 183 insertions(+), 121 deletions(-)
diff mbox

Patch

diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 175c86bee3a9..d5b11fff9e88 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -111,7 +111,7 @@  static const struct dmi_system_id processor_power_dmi_table[] = {
  * Callers should disable interrupts before the call and enable
  * interrupts after return.
  */
-static void acpi_safe_halt(void)
+__cpuidle static void acpi_safe_halt(void)
 {
 	if (!tif_need_resched()) {
 		safe_halt();
@@ -680,7 +680,7 @@  static int acpi_idle_bm_check(void)
  *
  * Caller disables interrupt before call and enables interrupt after return.
  */
-static void acpi_idle_do_entry(struct acpi_processor_cx *cx)
+__cpuidle static void acpi_idle_do_entry(struct acpi_processor_cx *cx)
 {
 	if (cx->entry_method == ACPI_CSTATE_FFH) {
 		/* Call into architectural FFH based C-state */