Message ID | 87ehzstr07.fsf@vostro.fn.ogness.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
John Ogness <john.ogness@linutronix.de> writes: > From 6f3f381800367127dc6430d9b9fa9bd6fc6d8ed0 Mon Sep 17 00:00:00 2001 > From: John Ogness <john.ogness@linutronix.de> > > The alignment exception handler is setup fairly late in > the boot process (fs_initcall). However, with newer gcc > versions and the compiler option -fconserve-stack, code > accessing unaligned data is generated in functions that > are called earlier, for example pcpu_dump_alloc_info(). This is a gcc bug and should not be worked around like this. There was another patch[1] to disable -fconserve-stack on ARM due to this broken behaviour. This patch was the outcome of a rather lengthy discussion[2]. [1] http://article.gmane.org/gmane.linux.kernel/1148272 [2] http://thread.gmane.org/gmane.linux.ports.arm.kernel/117863 > This results in unhandled alignment exceptions during > boot. By setting up the exception handler sooner, we > reduce the window where a compiler may generate code > accessing unaligned data. I will also restate my opinion that enabling strict alignment checking on ARMv6 and later is *wrong* in the first place.
On Wed, Sep 07, 2011 at 02:35:04PM +0100, John Ogness wrote: > From 6f3f381800367127dc6430d9b9fa9bd6fc6d8ed0 Mon Sep 17 00:00:00 2001 > From: John Ogness <john.ogness@linutronix.de> > > The alignment exception handler is setup fairly late in > the boot process (fs_initcall). However, with newer gcc > versions and the compiler option -fconserve-stack, code > accessing unaligned data is generated in functions that > are called earlier, for example pcpu_dump_alloc_info(). > This results in unhandled alignment exceptions during > boot. By setting up the exception handler sooner, we > reduce the window where a compiler may generate code > accessing unaligned data. While this reduces the window and fixes this particular case, it still doesn't solve the problem. We never know when some unaligned access would be generated for some earlier code. I would just make sure that SCTLR.A bit is always cleared on ARMv6+, independent of CONFIG_ALIGNMENT_TRAP.
On Wed, Sep 07, 2011 at 05:42:19PM +0100, Måns Rullgård wrote: > There are such instructions (ldrd, ldm), but gcc will not emit those > unless the address is known to be aligned. For ARMv6 and later, gcc 4.6 > *will* emit potentially unaligned ldr and ldrh since these very clearly > allow an unaligned address and are faster than the alternatives in all > implementations to date. This is unless strict alignment checking is > explicitly enabled, which unfortunately the Linux kernel does for no > apparent reason at all. "no apparant reason at all" heh. The reason is to keep the code simple and free from bugs. To do otherwise means that each of the CPU files needs to be littered with ifdefs to deal with the alignment fault configuration, of which there are 16 of them (ignoring v6 and v7.) If you think code maintanence of the same thing in 16 places is efficient then I guess there is "no apparant reason". I beg to differ, being one of those folk who have had to edit 18 different places several times. So no, I do not intend to move this: #ifdef CONFIG_ALIGNMENT_TRAP orr r0, r0, #CR_A #else bic r0, r0, #CR_A #endif into 16 separate places in the kernel.
On Wed, Sep 07, 2011 at 06:04:52PM +0100, Russell King - ARM Linux wrote: > On Wed, Sep 07, 2011 at 05:42:19PM +0100, Måns Rullgård wrote: > > There are such instructions (ldrd, ldm), but gcc will not emit those > > unless the address is known to be aligned. For ARMv6 and later, gcc 4.6 > > *will* emit potentially unaligned ldr and ldrh since these very clearly > > allow an unaligned address and are faster than the alternatives in all > > implementations to date. This is unless strict alignment checking is > > explicitly enabled, which unfortunately the Linux kernel does for no > > apparent reason at all. > > "no apparant reason at all" heh. The reason is to keep the code > simple and free from bugs. To do otherwise means that each of the > CPU files needs to be littered with ifdefs to deal with the alignment > fault configuration, of which there are 16 of them (ignoring v6 and v7.) > > If you think code maintanence of the same thing in 16 places is efficient > then I guess there is "no apparant reason". I beg to differ, being one > of those folk who have had to edit 18 different places several times. > > So no, I do not intend to move this: > > #ifdef CONFIG_ALIGNMENT_TRAP > orr r0, r0, #CR_A > #else > bic r0, r0, #CR_A > #endif > > into 16 separate places in the kernel. What about something like this (untested): #if defined(CONFIG_ALIGNMENT_TRAP) && __LINUX_ARM_ARCH__ < 6 orr r0, r0, #CR_A #else bic r0, r0, #CR_A #endif
Russell King - ARM Linux <linux@arm.linux.org.uk> writes: > On Wed, Sep 07, 2011 at 05:42:19PM +0100, Måns Rullgård wrote: >> There are such instructions (ldrd, ldm), but gcc will not emit those >> unless the address is known to be aligned. For ARMv6 and later, gcc 4.6 >> *will* emit potentially unaligned ldr and ldrh since these very clearly >> allow an unaligned address and are faster than the alternatives in all >> implementations to date. This is unless strict alignment checking is >> explicitly enabled, which unfortunately the Linux kernel does for no >> apparent reason at all. > > "no apparant reason at all" heh. The reason is to keep the code > simple and free from bugs. Some people, myself included, consider the current behaviour a bug. > To do otherwise means that each of the CPU files needs to be littered > with ifdefs to deal with the alignment fault configuration, of which > there are 16 of them (ignoring v6 and v7.) > > If you think code maintanence of the same thing in 16 places is efficient > then I guess there is "no apparant reason". I beg to differ, being one > of those folk who have had to edit 18 different places several times. > > So no, I do not intend to move this: > > #ifdef CONFIG_ALIGNMENT_TRAP > orr r0, r0, #CR_A > #else > bic r0, r0, #CR_A > #endif > > into 16 separate places in the kernel. So change that condition to also depend on !CPU_V6 && !CPU_V7 or something equivalent.
=========== begin align_test.c ========== extern void exfunc(char *str); void somefunc(void) { char mystr[] = "--------"; /* 9 bytes */ exfunc(mystr); } ============ end align_test.c =========== Using the cross toolchain: arm-none-linux-gnueabi-gcc (Sourcery G++ Lite 2011.03-41) 4.5.2 $ arm-none-linux-gnueabi-gcc \ -march=armv6 \ -mtune=arm1136j-s \ -marm \ -Os \ -fconserve-stack \ -c align_test.c The object dump of align_test.o shows: 00000000 <somefunc>: 0: e92d401f push {r0, r1, r2, r3, r4, lr} 4: e28d0007 add r0, sp, #7 8: e59f3020 ldr r3, [pc, #32] ; 30 <somefunc+0x30> c: e5932000 ldr r2, [r3] 10: e58d2007 str r2, [sp, #7] ... ... At address 0x10 there is unaligned access. Signed-off-by: John Ogness <john.ogness@linutronix.de> --- Patch against linux-next-20110831. arch/arm/include/asm/setup.h | 1 + arch/arm/kernel/setup.c | 2 ++ arch/arm/mm/alignment.c | 10 +++++++--- 3 files changed, 10 insertions(+), 3 deletions(-) diff --git a/arch/arm/include/asm/setup.h b/arch/arm/include/asm/setup.h index a3ca303..b3e18f8 100644 --- a/arch/arm/include/asm/setup.h +++ b/arch/arm/include/asm/setup.h @@ -232,6 +232,7 @@ extern struct meminfo meminfo; extern int arm_add_memory(phys_addr_t start, unsigned long size); extern void early_print(const char *str, ...); extern void dump_machine_table(void); +extern int __init alignment_init(void); #endif /* __KERNEL__ */ diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c index 0ca06f7..0f1b2b5 100644 --- a/arch/arm/kernel/setup.c +++ b/arch/arm/kernel/setup.c @@ -950,6 +950,8 @@ void __init setup_arch(char **cmdline_p) #endif early_trap_init(); + alignment_init(); + if (mdesc->init_early) mdesc->init_early(); } diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c index 715eb1d..5f013cb 100644 --- a/arch/arm/mm/alignment.c +++ b/arch/arm/mm/alignment.c @@ -950,7 +950,7 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs) * it isn't a sysctl, and it doesn't contain sysctl information. * We now locate it in /proc/cpu/alignment instead. */ -static int __init alignment_init(void) +static int __init alignment_sysfs_init(void) { #ifdef CONFIG_PROC_FS struct proc_dir_entry *res; @@ -960,7 +960,13 @@ static int __init alignment_init(void) if (!res) return -ENOMEM; #endif + return 0; +} + +fs_initcall(alignment_sysfs_init); +int __init alignment_init(void) +{ if (cpu_is_v6_unaligned()) { cr_alignment &= ~CR_A; cr_no_alignment &= ~CR_A; @@ -985,5 +991,3 @@ static int __init alignment_init(void) return 0; } - -fs_initcall(alignment_init);
From 6f3f381800367127dc6430d9b9fa9bd6fc6d8ed0 Mon Sep 17 00:00:00 2001 From: John Ogness <john.ogness@linutronix.de> The alignment exception handler is setup fairly late in the boot process (fs_initcall). However, with newer gcc versions and the compiler option -fconserve-stack, code accessing unaligned data is generated in functions that are called earlier, for example pcpu_dump_alloc_info(). This results in unhandled alignment exceptions during boot. By setting up the exception handler sooner, we reduce the window where a compiler may generate code accessing unaligned data. Here is a minimal example that shows the issue seen with pcpu_dump_alloc_info():