diff mbox

acpi: fix incompatibility with mcount-based function graph tracing

Message ID 20170324181254.gouyrbmppukrrbb6@treble (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Josh Poimboeuf March 24, 2017, 6:12 p.m. UTC
On Tue, Mar 21, 2017 at 09:44:03PM +0100, Paul Menzel wrote:
> I checked out Linux 4.9.16, applied your patch on top, and copied the Debian
> 4.9 Linux kernel configuration, did `make menuconfig`, disabled building
> debugging symbols, and executed `ARCH=i386 make -j40 deb-pkg`.
> 
> I installed that package on the Lenovo X60, and the result with tracing
> enabled has improved. The system suspends without a crash. Unfortunately,
> instead of resuming when pressing the power button, it starts from scratch.
> Suspend and resume without tracing enabled works though.
> 
> I’ll try to collect logs, but I don’t know, if there will be any, if the
> system just resets.
> 
> Maybe, this can be reproduced in QEMU?

So I was able to recreate this issue in qemu, and after some hours of
debugging I managed to figure it out.

It's rebooting during the resume because of a triple fault in
prepare_ftrace_return().

acpi wakeup for secondary cpu
  startup_32_smp()
    load_ucode_ap()
      prepare_ftrace_return()
        ftrace_graph_is_dead()
	  dereferences virtual address (kill_ftrace_graph) in real mode <-- BOOM

I tried fixing it by changing load_ucode_ap() to notrace, but that
function calls some other functions which also have mcount hooks, which
call other functions, etc.

Instead I was able to "fix" it by ignoring ftrace calls in real mode:

-----
index 8f3d9cf..5c0d0c6 100644
I'm not sure what the best fix should really be.  A few ideas off the
top of my head:

- A real mode check similar to the above (except it should probably be
  more precise)

- Make tracing_graph_pause a percpu variable so that it can be read from
  prepare_ftrace_return()

- pause_graph_tracing() from ftrace_suspend_notifier_call()

Steven, thoughts?

Comments

Steven Rostedt March 24, 2017, 6:41 p.m. UTC | #1
On Fri, 24 Mar 2017 13:12:54 -0500
Josh Poimboeuf <jpoimboe@redhat.com> wrote:


> Instead I was able to "fix" it by ignoring ftrace calls in real mode:
> 
> -----
> index 8f3d9cf..5c0d0c6 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -983,6 +983,9 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
>  	unsigned long return_hooker = (unsigned long)
>  				&return_to_handler;
>  
> +	if (__builtin_return_address(0) < TASK_SIZE_MAX)
> +		return;
> +
>  	if (unlikely(ftrace_graph_is_dead()))
>  		return;
> ---------------
> 
> I'm not sure what the best fix should really be.  A few ideas off the
> top of my head:
> 
> - A real mode check similar to the above (except it should probably be
>   more precise)

The real mode check hack may be good enough for now. Make sure that
it's commented well.

-- Steve

> 
> - Make tracing_graph_pause a percpu variable so that it can be read from
>   prepare_ftrace_return()
> 
> - pause_graph_tracing() from ftrace_suspend_notifier_call()
> 
> Steven, thoughts?
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki March 25, 2017, 1:20 p.m. UTC | #2
On Friday, March 24, 2017 02:41:14 PM Steven Rostedt wrote:
> On Fri, 24 Mar 2017 13:12:54 -0500
> Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
> 
> > Instead I was able to "fix" it by ignoring ftrace calls in real mode:
> > 
> > -----
> > index 8f3d9cf..5c0d0c6 100644
> > --- a/arch/x86/kernel/ftrace.c
> > +++ b/arch/x86/kernel/ftrace.c
> > @@ -983,6 +983,9 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
> >  	unsigned long return_hooker = (unsigned long)
> >  				&return_to_handler;
> >  
> > +	if (__builtin_return_address(0) < TASK_SIZE_MAX)
> > +		return;
> > +
> >  	if (unlikely(ftrace_graph_is_dead()))
> >  		return;
> > ---------------
> > 
> > I'm not sure what the best fix should really be.  A few ideas off the
> > top of my head:
> > 
> > - A real mode check similar to the above (except it should probably be
> >   more precise)
> 
> The real mode check hack may be good enough for now. Make sure that
> it's commented well.

Agreed.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josh Poimboeuf March 27, 2017, 2:08 p.m. UTC | #3
On Sat, Mar 25, 2017 at 02:20:11PM +0100, Rafael J. Wysocki wrote:
> On Friday, March 24, 2017 02:41:14 PM Steven Rostedt wrote:
> > On Fri, 24 Mar 2017 13:12:54 -0500
> > Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > 
> > 
> > > Instead I was able to "fix" it by ignoring ftrace calls in real mode:
> > > 
> > > -----
> > > index 8f3d9cf..5c0d0c6 100644
> > > --- a/arch/x86/kernel/ftrace.c
> > > +++ b/arch/x86/kernel/ftrace.c
> > > @@ -983,6 +983,9 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
> > >  	unsigned long return_hooker = (unsigned long)
> > >  				&return_to_handler;
> > >  
> > > +	if (__builtin_return_address(0) < TASK_SIZE_MAX)
> > > +		return;
> > > +
> > >  	if (unlikely(ftrace_graph_is_dead()))
> > >  		return;
> > > ---------------
> > > 
> > > I'm not sure what the best fix should really be.  A few ideas off the
> > > top of my head:
> > > 
> > > - A real mode check similar to the above (except it should probably be
> > >   more precise)
> > 
> > The real mode check hack may be good enough for now. Make sure that
> > it's commented well.
> 
> Agreed.

Just to clarify, there are two bugs related to function graph tracing
and suspend/resume.  The original patch in this thread (which removes
'-Os' from the acpi Makefile) is still needed.
Steven Rostedt March 27, 2017, 3:30 p.m. UTC | #4
On Mon, 27 Mar 2017 16:53:09 +0200
Arnd Bergmann <arnd@arndb.de> wrote:


> > Actually, I believe that "%zd" will work. It's made to work with size_t
> > which is long long on 32 and long on 64.  
> 
> size_t is always 'long', not 'long long'. We have %pad for dma_addr_t
> which may be 'long' or 'long long', but it is configuration dependent
> which one it is on 32-bit.

Ah your right. It was that it was defined as "int" on 32 and "long" on
64, and that caused problems with warnings when using "%d" when it was
defined as long.

> 
> We could probably introduce a %pts format string for timespec64
> and have that pretty-printed.

Hmm, probably don't want a %p as that suggests its a pointer, which it
should not be. Unless we pass in the address of the number.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arnd Bergmann March 27, 2017, 3:35 p.m. UTC | #5
On Mon, Mar 27, 2017 at 5:30 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Mon, 27 Mar 2017 16:53:09 +0200
>> We could probably introduce a %pts format string for timespec64
>> and have that pretty-printed.
>
> Hmm, probably don't want a %p as that suggests its a pointer, which it
> should not be. Unless we pass in the address of the number.

The special format strings that the kernel defines all start with %p and
require passing by reference so we don't get a warning from gcc. We can't
just make up new format strings otherwise, but we can create new meaning
for special pointers as we do for struct resource and others.

       Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steven Rostedt March 27, 2017, 4:11 p.m. UTC | #6
On Mon, 27 Mar 2017 17:35:13 +0200
Arnd Bergmann <arnd@arndb.de> wrote:

> On Mon, Mar 27, 2017 at 5:30 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > On Mon, 27 Mar 2017 16:53:09 +0200  
> >> We could probably introduce a %pts format string for timespec64
> >> and have that pretty-printed.  
> >
> > Hmm, probably don't want a %p as that suggests its a pointer, which it
> > should not be. Unless we pass in the address of the number.  
> 
> The special format strings that the kernel defines all start with %p and
> require passing by reference so we don't get a warning from gcc. We can't
> just make up new format strings otherwise, but we can create new meaning
> for special pointers as we do for struct resource and others.
> 

That's fine, but we need to be careful when it comes to tracing.
Passing in the address of a structure in the ring buffer may be fine,
but we need to make sure that an address pointing to something other
than the ring buffer is forbidden.

I'll need to update libtraceevent to handle such cases too.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki March 27, 2017, 4:59 p.m. UTC | #7
On Monday, March 27, 2017 09:08:43 AM Josh Poimboeuf wrote:
> On Sat, Mar 25, 2017 at 02:20:11PM +0100, Rafael J. Wysocki wrote:
> > On Friday, March 24, 2017 02:41:14 PM Steven Rostedt wrote:
> > > On Fri, 24 Mar 2017 13:12:54 -0500
> > > Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > > 
> > > 
> > > > Instead I was able to "fix" it by ignoring ftrace calls in real mode:
> > > > 
> > > > -----
> > > > index 8f3d9cf..5c0d0c6 100644
> > > > --- a/arch/x86/kernel/ftrace.c
> > > > +++ b/arch/x86/kernel/ftrace.c
> > > > @@ -983,6 +983,9 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
> > > >  	unsigned long return_hooker = (unsigned long)
> > > >  				&return_to_handler;
> > > >  
> > > > +	if (__builtin_return_address(0) < TASK_SIZE_MAX)
> > > > +		return;
> > > > +
> > > >  	if (unlikely(ftrace_graph_is_dead()))
> > > >  		return;
> > > > ---------------
> > > > 
> > > > I'm not sure what the best fix should really be.  A few ideas off the
> > > > top of my head:
> > > > 
> > > > - A real mode check similar to the above (except it should probably be
> > > >   more precise)
> > > 
> > > The real mode check hack may be good enough for now. Make sure that
> > > it's commented well.
> > 
> > Agreed.
> 
> Just to clarify, there are two bugs related to function graph tracing
> and suspend/resume.  The original patch in this thread (which removes
> '-Os' from the acpi Makefile) is still needed.

OK

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -983,6 +983,9 @@  void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
 	unsigned long return_hooker = (unsigned long)
 				&return_to_handler;
 
+	if (__builtin_return_address(0) < TASK_SIZE_MAX)
+		return;
+
 	if (unlikely(ftrace_graph_is_dead()))
 		return;
---------------