diff mbox

ARM: alignment: setup alignment handler earlier

Message ID 87ehzstr07.fsf@vostro.fn.ogness.net (mailing list archive)
State New, archived
Headers show

Commit Message

John Ogness Sept. 7, 2011, 1:35 p.m. UTC
From 6f3f381800367127dc6430d9b9fa9bd6fc6d8ed0 Mon Sep 17 00:00:00 2001
From: John Ogness <john.ogness@linutronix.de>

The alignment exception handler is setup fairly late in
the boot process (fs_initcall). However, with newer gcc
versions and the compiler option -fconserve-stack, code
accessing unaligned data is generated in functions that
are called earlier, for example pcpu_dump_alloc_info().
This results in unhandled alignment exceptions during
boot. By setting up the exception handler sooner, we
reduce the window where a compiler may generate code
accessing unaligned data.

Here is a minimal example that shows the issue seen with
pcpu_dump_alloc_info():

Comments

Mans Rullgard Sept. 7, 2011, 1:54 p.m. UTC | #1
John Ogness <john.ogness@linutronix.de> writes:

> From 6f3f381800367127dc6430d9b9fa9bd6fc6d8ed0 Mon Sep 17 00:00:00 2001
> From: John Ogness <john.ogness@linutronix.de>
>
> The alignment exception handler is setup fairly late in
> the boot process (fs_initcall). However, with newer gcc
> versions and the compiler option -fconserve-stack, code
> accessing unaligned data is generated in functions that
> are called earlier, for example pcpu_dump_alloc_info().

This is a gcc bug and should not be worked around like this.  There was
another patch[1] to disable -fconserve-stack on ARM due to this broken
behaviour.  This patch was the outcome of a rather lengthy discussion[2].

[1] http://article.gmane.org/gmane.linux.kernel/1148272
[2] http://thread.gmane.org/gmane.linux.ports.arm.kernel/117863

> This results in unhandled alignment exceptions during
> boot. By setting up the exception handler sooner, we
> reduce the window where a compiler may generate code
> accessing unaligned data.

I will also restate my opinion that enabling strict alignment checking
on ARMv6 and later is *wrong* in the first place.
Catalin Marinas Sept. 7, 2011, 2:40 p.m. UTC | #2
On Wed, Sep 07, 2011 at 02:35:04PM +0100, John Ogness wrote:
> From 6f3f381800367127dc6430d9b9fa9bd6fc6d8ed0 Mon Sep 17 00:00:00 2001
> From: John Ogness <john.ogness@linutronix.de>
> 
> The alignment exception handler is setup fairly late in
> the boot process (fs_initcall). However, with newer gcc
> versions and the compiler option -fconserve-stack, code
> accessing unaligned data is generated in functions that
> are called earlier, for example pcpu_dump_alloc_info().
> This results in unhandled alignment exceptions during
> boot. By setting up the exception handler sooner, we
> reduce the window where a compiler may generate code
> accessing unaligned data.

While this reduces the window and fixes this particular case, it still
doesn't solve the problem. We never know when some unaligned access
would be generated for some earlier code.

I would just make sure that SCTLR.A bit is always cleared on ARMv6+,
independent of CONFIG_ALIGNMENT_TRAP.
Russell King - ARM Linux Sept. 7, 2011, 5:04 p.m. UTC | #3
On Wed, Sep 07, 2011 at 05:42:19PM +0100, Måns Rullgård wrote:
> There are such instructions (ldrd, ldm), but gcc will not emit those
> unless the address is known to be aligned.  For ARMv6 and later, gcc 4.6
> *will* emit potentially unaligned ldr and ldrh since these very clearly
> allow an unaligned address and are faster than the alternatives in all
> implementations to date.  This is unless strict alignment checking is
> explicitly enabled, which unfortunately the Linux kernel does for no
> apparent reason at all.

"no apparant reason at all" heh.  The reason is to keep the code
simple and free from bugs.  To do otherwise means that each of the
CPU files needs to be littered with ifdefs to deal with the alignment
fault configuration, of which there are 16 of them (ignoring v6 and v7.)

If you think code maintanence of the same thing in 16 places is efficient
then I guess there is "no apparant reason".  I beg to differ, being one
of those folk who have had to edit 18 different places several times.

So no, I do not intend to move this:

#ifdef CONFIG_ALIGNMENT_TRAP
        orr     r0, r0, #CR_A
#else
        bic     r0, r0, #CR_A
#endif

into 16 separate places in the kernel.
Catalin Marinas Sept. 7, 2011, 5:24 p.m. UTC | #4
On Wed, Sep 07, 2011 at 06:04:52PM +0100, Russell King - ARM Linux wrote:
> On Wed, Sep 07, 2011 at 05:42:19PM +0100, Måns Rullgård wrote:
> > There are such instructions (ldrd, ldm), but gcc will not emit those
> > unless the address is known to be aligned.  For ARMv6 and later, gcc 4.6
> > *will* emit potentially unaligned ldr and ldrh since these very clearly
> > allow an unaligned address and are faster than the alternatives in all
> > implementations to date.  This is unless strict alignment checking is
> > explicitly enabled, which unfortunately the Linux kernel does for no
> > apparent reason at all.
> 
> "no apparant reason at all" heh.  The reason is to keep the code
> simple and free from bugs.  To do otherwise means that each of the
> CPU files needs to be littered with ifdefs to deal with the alignment
> fault configuration, of which there are 16 of them (ignoring v6 and v7.)
> 
> If you think code maintanence of the same thing in 16 places is efficient
> then I guess there is "no apparant reason".  I beg to differ, being one
> of those folk who have had to edit 18 different places several times.
> 
> So no, I do not intend to move this:
> 
> #ifdef CONFIG_ALIGNMENT_TRAP
>         orr     r0, r0, #CR_A
> #else
>         bic     r0, r0, #CR_A
> #endif
> 
> into 16 separate places in the kernel.

What about something like this (untested):

#if defined(CONFIG_ALIGNMENT_TRAP) && __LINUX_ARM_ARCH__ < 6
	orr	r0, r0, #CR_A
#else
	bic	r0, r0, #CR_A
#endif
Mans Rullgard Sept. 7, 2011, 5:40 p.m. UTC | #5
Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Wed, Sep 07, 2011 at 05:42:19PM +0100, Måns Rullgård wrote:
>> There are such instructions (ldrd, ldm), but gcc will not emit those
>> unless the address is known to be aligned.  For ARMv6 and later, gcc 4.6
>> *will* emit potentially unaligned ldr and ldrh since these very clearly
>> allow an unaligned address and are faster than the alternatives in all
>> implementations to date.  This is unless strict alignment checking is
>> explicitly enabled, which unfortunately the Linux kernel does for no
>> apparent reason at all.
>
> "no apparant reason at all" heh.  The reason is to keep the code
> simple and free from bugs.

Some people, myself included, consider the current behaviour a bug.

> To do otherwise means that each of the CPU files needs to be littered
> with ifdefs to deal with the alignment fault configuration, of which
> there are 16 of them (ignoring v6 and v7.)
>
> If you think code maintanence of the same thing in 16 places is efficient
> then I guess there is "no apparant reason".  I beg to differ, being one
> of those folk who have had to edit 18 different places several times.
>
> So no, I do not intend to move this:
>
> #ifdef CONFIG_ALIGNMENT_TRAP
>         orr     r0, r0, #CR_A
> #else
>         bic     r0, r0, #CR_A
> #endif
>
> into 16 separate places in the kernel.

So change that condition to also depend on !CPU_V6 && !CPU_V7 or
something equivalent.
diff mbox

Patch

=========== begin align_test.c ==========
extern void exfunc(char *str);
void somefunc(void)
{
        char mystr[] = "--------";  /* 9 bytes */
        exfunc(mystr);
}
============ end align_test.c ===========

Using the cross toolchain:
arm-none-linux-gnueabi-gcc (Sourcery G++ Lite 2011.03-41) 4.5.2

$ arm-none-linux-gnueabi-gcc \
     -march=armv6 \
     -mtune=arm1136j-s \
     -marm \
     -Os \
     -fconserve-stack \
     -c align_test.c

The object dump of align_test.o shows:

00000000 <somefunc>:
   0:   e92d401f        push    {r0, r1, r2, r3, r4, lr}
   4:   e28d0007        add     r0, sp, #7
   8:   e59f3020        ldr     r3, [pc, #32]   ; 30 <somefunc+0x30>
   c:   e5932000        ldr     r2, [r3]
  10:   e58d2007        str     r2, [sp, #7]
  ...
  ...

At address 0x10 there is unaligned access.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
Patch against linux-next-20110831.
 arch/arm/include/asm/setup.h |    1 +
 arch/arm/kernel/setup.c      |    2 ++
 arch/arm/mm/alignment.c      |   10 +++++++---
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/setup.h b/arch/arm/include/asm/setup.h
index a3ca303..b3e18f8 100644
--- a/arch/arm/include/asm/setup.h
+++ b/arch/arm/include/asm/setup.h
@@ -232,6 +232,7 @@  extern struct meminfo meminfo;
 extern int arm_add_memory(phys_addr_t start, unsigned long size);
 extern void early_print(const char *str, ...);
 extern void dump_machine_table(void);
+extern int __init alignment_init(void);
 
 #endif  /*  __KERNEL__  */
 
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 0ca06f7..0f1b2b5 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -950,6 +950,8 @@  void __init setup_arch(char **cmdline_p)
 #endif
 	early_trap_init();
 
+	alignment_init();
+
 	if (mdesc->init_early)
 		mdesc->init_early();
 }
diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
index 715eb1d..5f013cb 100644
--- a/arch/arm/mm/alignment.c
+++ b/arch/arm/mm/alignment.c
@@ -950,7 +950,7 @@  do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
  * it isn't a sysctl, and it doesn't contain sysctl information.
  * We now locate it in /proc/cpu/alignment instead.
  */
-static int __init alignment_init(void)
+static int __init alignment_sysfs_init(void)
 {
 #ifdef CONFIG_PROC_FS
 	struct proc_dir_entry *res;
@@ -960,7 +960,13 @@  static int __init alignment_init(void)
 	if (!res)
 		return -ENOMEM;
 #endif
+	return 0;
+}
+
+fs_initcall(alignment_sysfs_init);
 
+int __init alignment_init(void)
+{
 	if (cpu_is_v6_unaligned()) {
 		cr_alignment &= ~CR_A;
 		cr_no_alignment &= ~CR_A;
@@ -985,5 +991,3 @@  static int __init alignment_init(void)
 
 	return 0;
 }
-
-fs_initcall(alignment_init);