diff mbox

[RFC,1/2] ARM: deprecate old APCS frame format

Message ID 1453741521-6878-2-git-send-email-jean-philippe.brucker@arm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jean-Philippe Brucker Jan. 25, 2016, 5:05 p.m. UTC
GCC 5 deprecated option -mapcs-frame, which generated frames following
the old ABI format. In preparation for a possible removal in future
versions of GCC, deprecate use of this flag in Linux as well.

Instead of completely removing the bits that depend on APCS frame, we
move them behind a DEPRECATED config option to smoothen the transition.

Neither AAPCS nor EABI guarantees any frame format, so only ARM_UNWIND
should now be used for stack traces and introspection. It is already
selected by most defconfigs.
Furthermore, frames currently generated by GCC are different for leaf
and non-leaf functions, which would make adaptation quite tricky.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 arch/arm/Kconfig.debug           |   16 ++++++++++++----
 arch/arm/Makefile                |    6 +++++-
 arch/arm/include/asm/ftrace.h    |    8 ++++----
 arch/arm/kernel/entry-ftrace.S   |    2 +-
 arch/arm/kernel/return_address.c |    4 ++--
 arch/arm/kernel/stacktrace.c     |   13 ++++++++++---
 arch/arm/lib/backtrace.S         |    2 +-
 arch/arm/net/bpf_jit_32.c        |    6 +++---
 8 files changed, 38 insertions(+), 19 deletions(-)

Comments

Russell King - ARM Linux Feb. 1, 2016, 5:59 p.m. UTC | #1
On Mon, Jan 25, 2016 at 05:05:20PM +0000, Jean-Philippe Brucker wrote:
> GCC 5 deprecated option -mapcs-frame, which generated frames following
> the old ABI format. In preparation for a possible removal in future
> versions of GCC, deprecate use of this flag in Linux as well.
> 
> Instead of completely removing the bits that depend on APCS frame, we
> move them behind a DEPRECATED config option to smoothen the transition.
> 
> Neither AAPCS nor EABI guarantees any frame format, so only ARM_UNWIND
> should now be used for stack traces and introspection. It is already
> selected by most defconfigs.
> Furthermore, frames currently generated by GCC are different for leaf
> and non-leaf functions, which would make adaptation quite tricky.

As I mentioned on the call last week, I'm not happy with deprecating
APCS frames.  The unwinder does not always work, and each time the
unwinder fails to work, it creates more work.

The latest scenario can be found in the "[PATCH 1/2] ARM: make
virt_to_idmap() return unsigned long" thread, where we have a
failure to unwind the first oops - unwinding stops for apparently
no reason after the first frame.

So, we have no idea how tasket_action() got called.

My current idea on that oops is... I have no idea, and the only way
to get more of an idea would be to request a rebuild with thumb
disabled and frame pointers enabled, and for the problem to be
reproduced.

This is the issue: APCS frame pointer based backtracing works every
time.  The unwinder is and always has been unreliable.

GCC people need to think about this, because if they persue, we're
going to end up having to tell people to rebuild their kernels with
older GCC versions just to get debug information out of them, unless
the quality of the unwinder and unwind information can be improved.

However, having experienced the "quality" of gdb over the last five
years in userspace, I'm really not hopeful that there is a solution
that works using the unwind information: I suspect there's a great
number of cases where the unwind information is basically wrong or
otherwise broken.

The problem with unwind based solutions is the only time you know
that the unwinding information is broken is when the code goes wrong
and the unwind information has to be used.  The nice thing about the
frame-pointer based solution is that it's correctness is an inherent
requirement for the program to work, so any problem with a frame-
pointer based solution shows up as a program crash.

This is probably the root cause why the unwind based backtracing
tends to fail.

Here's the oops in question:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0003000
[00000000] *pgd=80000800004003, *pmd=00000000
Internal error: Oops: a07 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 330 Comm: kworker/0:1 Not tainted 4.5.0-rc1-00001-g22a8511 #1
Hardware name: Keystone
Workqueue: ipv6_addrconf addrconf_dad_work
task: eba89500 ti: eba96000 task.ti: eba96000
PC is at tasklet_action+0xac/0x13c
LR is at _raw_spin_unlock_irqrestore+0x28/0x54
pc : [<c0026c48>]    lr : [<c0574634>]    psr: 40000133
sp : eba97d30  ip : eb400258  fp : c083aecc
r10: 0000000c  r9 : eba97dc8  r8 : 00000100
r7 : 00000008  r6 : eba96000  r5 : 00000003  r4 : c07e408c
r3 : eba97d00  r2 : eba97d00  r1 : 60000113  r0 : 00000000
Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb Segment kernel
Control: 30c5387d  Table: 2b9ba700  DAC: fffffffd
Process kworker/0:1 (pid: 330, stack limit = 0xeba96210)
Stack: (0xeba97d30 to 0xeba98000)
7d20:                                     000086dd ebbb9900 00000004 eba97d30
7d40: c07e4080 c08409c0 0000000a ffff8e53 c07e4100 04208060 ebbb9900 c07dce84
7d60: 00000000 0000004e 00000000 00000001 eba97dc8 eb838000 c082ad40 c0026778
7d80: c07dce84 c006754c c080f430 c07e5e84 f080400c eba97dc8 f0804000 f0805000
7da0: 00000001 c0009438 c04fc29c 20000013 ffffffff eba97dfc cc8da400 eaf54d00
...
[<c0026c48>] (tasklet_action) from [<c08409c0>] (irq_stat+0x0/0x100)
Code: 4000 e256 0020 0a00 (5004) e1a0

Obviously, tasklet_action was _not_ called by the irq_stat array!

What we can see from the stack is there was an exception frame at
0xeba97d6c, with the parent context LR = c0009438 and PC = c04fc29c.
Much more than that is just guesswork - but the point is, the kernel
_should_ be capable of giving a good backtrace.

If we can't get a good backtrace, it limits those who can diagnose
the case of the failure, which really isn't good.

What we _might_ end up having to do is to walk every word on the
kernel stack, check whether it's in a module .text or kernel .text,
and print it as a "best guess" at a backtrace, but that will be
very misleading, as registers contain pointers to literal data,
which can be located not only at the end of a function, but also
the middle of a function, and that can end up on the stack too.
Jean-Philippe Brucker Feb. 4, 2016, 12:28 p.m. UTC | #2
Russell,

On Mon, Feb 01, 2016 at 05:59:07PM +0000, Russell King - ARM Linux wrote:
> On Mon, Jan 25, 2016 at 05:05:20PM +0000, Jean-Philippe Brucker wrote:
> > GCC 5 deprecated option -mapcs-frame, which generated frames following
> > the old ABI format. In preparation for a possible removal in future
> > versions of GCC, deprecate use of this flag in Linux as well.
> > 
> > Instead of completely removing the bits that depend on APCS frame, we
> > move them behind a DEPRECATED config option to smoothen the transition.
> > 
> > Neither AAPCS nor EABI guarantees any frame format, so only ARM_UNWIND
> > should now be used for stack traces and introspection. It is already
> > selected by most defconfigs.
> > Furthermore, frames currently generated by GCC are different for leaf
> > and non-leaf functions, which would make adaptation quite tricky.
> 
> As I mentioned on the call last week, I'm not happy with deprecating
> APCS frames.  The unwinder does not always work, and each time the
> unwinder fails to work, it creates more work.
> 
> The latest scenario can be found in the "[PATCH 1/2] ARM: make
> virt_to_idmap() return unsigned long" thread, where we have a
> failure to unwind the first oops - unwinding stops for apparently
> no reason after the first frame.

I spent some time trying to break the unwinder this week. The only
possibility I found so far is that the oops is triggered before the
function stack is stable.
I don't think it happens in your example, since lr couldn't be on
spin_unlock, but I will still describe that case because it is a bit
worrying.

My GCC 4.9 generates the following prologue for driver_unregister in a
Thumb2 kernel:

    801fd194 <driver_unregister>
    cbz     r0, 801fd1b0        <driver_unregister+0x1c>
    ldr     r3, [r0, #60]
    cbz     r3, 801fd1b0        <driver_unregister+0x1c>
    push    {r4, lr}

If the second instruction faults, we won't get a proper trace, because
the unwinder will find the following entry in the unwind table, and
assume r4 and lr are already on the stack:

    $ readelf -u vmlinux
    ...
    0x801fd194 <driver_unregister>: 0x80a8b0b0
    Compact model 0
    0xa8      pop {r4, r14}
    0xb0      finish
    ...

I might have missed something in the ABI, but if there's no possible way
to handle this case properly in unwind.c, we need to sort it out with
the GCC people.
I found similar constructions in kernels generated with GCC 4.6 and 5.1.

> So, we have no idea how tasket_action() got called.
> 
> My current idea on that oops is... I have no idea, and the only way
> to get more of an idea would be to request a rebuild with thumb
> disabled and frame pointers enabled, and for the problem to be
> reproduced.
> 
> This is the issue: APCS frame pointer based backtracing works every
> time.  The unwinder is and always has been unreliable.
> 
> GCC people need to think about this, because if they persue, we're
> going to end up having to tell people to rebuild their kernels with
> older GCC versions just to get debug information out of them, unless
> the quality of the unwinder and unwind information can be improved.
> 
> However, having experienced the "quality" of gdb over the last five
> years in userspace, I'm really not hopeful that there is a solution
> that works using the unwind information: I suspect there's a great
> number of cases where the unwind information is basically wrong or
> otherwise broken.
> 
> The problem with unwind based solutions is the only time you know
> that the unwinding information is broken is when the code goes wrong
> and the unwind information has to be used.  The nice thing about the
> frame-pointer based solution is that it's correctness is an inherent
> requirement for the program to work, so any problem with a frame-
> pointer based solution shows up as a program crash.
> 
> This is probably the root cause why the unwind based backtracing
> tends to fail.
> 
[...]
> 
> If we can't get a good backtrace, it limits those who can diagnose
> the case of the failure, which really isn't good.
> 

The current unwinder code also doesn't handle recursive calls (e.g.
of_platform_bus_create). There is room for improvement, and we probably
need to come up with a good self-test infrastructure before that
EXPERIMENTAL qualifier can be removed.
I agree that unwinding is not ready, which is one of the main reasons we
can't get rid of the -mapcs flag quite yet. However, I believe most
kernels out there already use ARM_UNWIND by default.

The other reason we can't remove -mapcs is that FUNCTION_GRAPH_TRACER
depends solely on the APCS frame at the moment, and is not easy to
change.
During my tests, I hacked function_graph to work with unwinding info.
It is a terrible solution because of the massive overhead it introduces
for each function call.
But the resulting graphs are the same as the ones generated using APCS
frames, which means that unwinding info are not too far from being
reliable.
I think we need to gather a comprehensive list of broken cases before we
can argue against unwinding to GCC people and ask for something more
reliable, but it does seem like we'll need a better solution for traces.

The problem is that -mapcs is not coming back as is, since it relies on
deprecated instructions. The motivation behind my series (apart from
starting this discussion) is to prevent people from introducing more
features that rely on it, while we work with GCC people to improve the
backtrace.

> What we _might_ end up having to do is to walk every word on the
> kernel stack, check whether it's in a module .text or kernel .text,
> and print it as a "best guess" at a backtrace, but that will be
> very misleading, as registers contain pointers to literal data,
> which can be located not only at the end of a function, but also
> the middle of a function, and that can end up on the stack too.

I really hope it won't come to that...

Thanks,
Jean-Philippe
diff mbox

Patch

diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
index 259c0ca..95b85ee 100644
--- a/arch/arm/Kconfig.debug
+++ b/arch/arm/Kconfig.debug
@@ -39,9 +39,17 @@  config FRAME_POINTER
 	default y if !ARM_UNWIND || FUNCTION_GRAPH_TRACER
 	help
 	  If you say N here, the resulting kernel will be slightly smaller and
-	  faster. However, if neither FRAME_POINTER nor ARM_UNWIND are enabled,
-	  when a problem occurs with the kernel, the information that is
-	  reported is severely limited.
+	  faster. When ARM_UNWIND is disabled, enabling FRAME_POINTER alone is
+	  not sufficient to get meaningful information in case of a kernel bug;
+	  you will also need DEPRECATED_APCS_FRAME.
+
+config DEPRECATED_APCS_FRAME
+	bool "Enable deprecated frame format"
+	depends on FRAME_POINTER
+	default n
+	help
+	  Use the legacy APCS frame format. This is required for ftrace with
+	  GCC < 4.4, as well as stack traces without ARM_UNWIND.
 
 config ARM_UNWIND
 	bool "Enable stack unwinding support (EXPERIMENTAL)"
@@ -56,7 +64,7 @@  config ARM_UNWIND
 
 config OLD_MCOUNT
 	bool
-	depends on FUNCTION_TRACER && FRAME_POINTER
+	depends on FUNCTION_TRACER && DEPRECATED_APCS_FRAME
 	default y
 
 config DEBUG_USER
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 2c2b28e..66a3adc 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -41,7 +41,11 @@  KBUILD_CFLAGS	+= $(call cc-option,-mno-unaligned-access)
 endif
 
 ifeq ($(CONFIG_FRAME_POINTER),y)
-KBUILD_CFLAGS	+=-fno-omit-frame-pointer -mapcs -mno-sched-prolog
+KBUILD_CFLAGS	+=-fno-omit-frame-pointer -mno-sched-prolog
+endif
+
+ifeq ($(CONFIG_DEPRECATED_APCS_FRAME),y)
+KBUILD_CFLAGS	+=-mapcs
 endif
 
 ifeq ($(CONFIG_CPU_BIG_ENDIAN),y)
diff --git a/arch/arm/include/asm/ftrace.h b/arch/arm/include/asm/ftrace.h
index bfe2a2f..a1ad19e 100644
--- a/arch/arm/include/asm/ftrace.h
+++ b/arch/arm/include/asm/ftrace.h
@@ -32,12 +32,12 @@  extern void ftrace_call_old(void);
 
 #ifndef __ASSEMBLY__
 
-#if defined(CONFIG_FRAME_POINTER) && !defined(CONFIG_ARM_UNWIND)
+#if defined(CONFIG_DEPRECATED_APCS_FRAME) && !defined(CONFIG_ARM_UNWIND)
 /*
  * return_address uses walk_stackframe to do it's work.  If both
- * CONFIG_FRAME_POINTER=y and CONFIG_ARM_UNWIND=y walk_stackframe uses unwind
- * information.  For this to work in the function tracer many functions would
- * have to be marked with __notrace.  So for now just depend on
+ * CONFIG_DEPRECATED_APCS_FRAME=y and CONFIG_ARM_UNWIND=y walk_stackframe uses
+ * unwind information.  For this to work in the function tracer many functions
+ * would have to be marked with __notrace.  So for now just depend on
  * !CONFIG_ARM_UNWIND.
  */
 
diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
index c73c403..b9984b5 100644
--- a/arch/arm/kernel/entry-ftrace.S
+++ b/arch/arm/kernel/entry-ftrace.S
@@ -53,7 +53,7 @@ 
 
 #ifndef CONFIG_OLD_MCOUNT
 #if (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 4))
-#error Ftrace requires CONFIG_FRAME_POINTER=y with GCC older than 4.4.0.
+#error Ftrace requires CONFIG_DEPRECATED_APCS_FRAME=y with GCC older than 4.4.0.
 #endif
 #endif
 
diff --git a/arch/arm/kernel/return_address.c b/arch/arm/kernel/return_address.c
index 36ed350..3728026 100644
--- a/arch/arm/kernel/return_address.c
+++ b/arch/arm/kernel/return_address.c
@@ -11,7 +11,7 @@ 
 #include <linux/export.h>
 #include <linux/ftrace.h>
 
-#if defined(CONFIG_FRAME_POINTER) && !defined(CONFIG_ARM_UNWIND)
+#if defined(CONFIG_DEPRECATED_APCS_FRAME) && !defined(CONFIG_ARM_UNWIND)
 #include <linux/sched.h>
 
 #include <asm/stacktrace.h>
@@ -56,6 +56,6 @@  void *return_address(unsigned int level)
 		return NULL;
 }
 
-#endif /* if defined(CONFIG_FRAME_POINTER) && !defined(CONFIG_ARM_UNWIND) */
+#endif /* if defined(CONFIG_DEPRECATED_APCS_FRAME) && !defined(CONFIG_ARM_UNWIND) */
 
 EXPORT_SYMBOL_GPL(return_address);
diff --git a/arch/arm/kernel/stacktrace.c b/arch/arm/kernel/stacktrace.c
index 92b7237..20e9a7f 100644
--- a/arch/arm/kernel/stacktrace.c
+++ b/arch/arm/kernel/stacktrace.c
@@ -5,13 +5,14 @@ 
 #include <asm/stacktrace.h>
 #include <asm/traps.h>
 
-#if defined(CONFIG_FRAME_POINTER) && !defined(CONFIG_ARM_UNWIND)
+#ifndef CONFIG_ARM_UNWIND
+#ifdef CONFIG_DEPRECATED_APCS_FRAME
 /*
  * Unwind the current stack frame and store the new register values in the
  * structure passed as argument. Unwinding is equivalent to a function return,
  * hence the new PC value rather than LR should be used for backtrace.
  *
- * With framepointer enabled, a simple function prologue looks like this:
+ * With APCS frames enabled, a simple function prologue looks like this:
  *	mov	ip, sp
  *	stmdb	sp!, {fp, ip, lr, pc}
  *	sub	fp, ip, #4
@@ -19,7 +20,7 @@ 
  * A simple function epilogue looks like this:
  *	ldm	sp, {fp, sp, pc}
  *
- * Note that with framepointer enabled, even the leaf functions have the same
+ * Note that with APCS frames enabled, even the leaf functions have the same
  * prologue and epilogue, therefore we can ignore the LR value in this case.
  */
 int notrace unwind_frame(struct stackframe *frame)
@@ -42,7 +43,13 @@  int notrace unwind_frame(struct stackframe *frame)
 
 	return 0;
 }
+#else /* !defined(CONFIG_DEPRECATED_APCS_FRAME) */
+int notrace unwind_frame(struct stackframe *frame)
+{
+	return -EINVAL;
+}
 #endif
+#endif /* !defined(CONFIG_ARM_UNWIND) */
 
 void notrace walk_stackframe(struct stackframe *frame,
 		     int (*fn)(struct stackframe *, void *), void *data)
diff --git a/arch/arm/lib/backtrace.S b/arch/arm/lib/backtrace.S
index fab5a50..3041ccb 100644
--- a/arch/arm/lib/backtrace.S
+++ b/arch/arm/lib/backtrace.S
@@ -24,7 +24,7 @@ 
 
 ENTRY(c_backtrace)
 
-#if !defined(CONFIG_FRAME_POINTER) || !defined(CONFIG_PRINTK)
+#if !defined(CONFIG_DEPRECATED_APCS_FRAME) || !defined(CONFIG_PRINTK)
 		ret	lr
 ENDPROC(c_backtrace)
 #else
diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 93d0b6d..8f7b718 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -165,7 +165,7 @@  static u16 saved_regs(struct jit_ctx *ctx)
 	    (ctx->skf->insns[0].code == (BPF_RET | BPF_A)))
 		ret |= 1 << r_A;
 
-#ifdef CONFIG_FRAME_POINTER
+#ifdef CONFIG_DEPRECATED_APCS_FRAME
 	ret |= (1 << ARM_FP) | (1 << ARM_IP) | (1 << ARM_LR) | (1 << ARM_PC);
 #else
 	if (ctx->seen & SEEN_CALL)
@@ -200,7 +200,7 @@  static void build_prologue(struct jit_ctx *ctx)
 	u16 reg_set = saved_regs(ctx);
 	u16 off;
 
-#ifdef CONFIG_FRAME_POINTER
+#ifdef CONFIG_DEPRECATED_APCS_FRAME
 	emit(ARM_MOV_R(ARM_IP, ARM_SP), ctx);
 	emit(ARM_PUSH(reg_set), ctx);
 	emit(ARM_SUB_I(ARM_FP, ARM_IP, 4), ctx);
@@ -244,7 +244,7 @@  static void build_epilogue(struct jit_ctx *ctx)
 
 	reg_set &= ~(1 << ARM_LR);
 
-#ifdef CONFIG_FRAME_POINTER
+#ifdef CONFIG_DEPRECATED_APCS_FRAME
 	/* the first instruction of the prologue was: mov ip, sp */
 	reg_set &= ~(1 << ARM_IP);
 	reg_set |= (1 << ARM_SP);