diff mbox

ARM: decompressor: clear SCTLR.A bit for v7 cores

Message ID 1349959402-24164-1-git-send-email-robherring2@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Rob Herring Oct. 11, 2012, 12:43 p.m. UTC
From: Rob Herring <rob.herring@calxeda.com>

With recent compilers and move to generic unaligned.h in commit d25c881
(ARM: 7493/1: use generic unaligned.h), unaligned accesses will be used
by the LZO decompressor on v7 cores. So we need to make sure unaligned
accesses are allowed by clearing the SCTLR A bit.

While v6 can support unaligned accesses, it is optional and current
compilers won't emit unaligned accesses. So we don't clear the A bit for
v6.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
---

The contents of this were already reviewed on this thread, so I sent this
to the patch system and this was Russell's reply:

> NAK for two reasons.
> 
> 1. It hasn't been on the list (I can't find a match for "clear SCTLR.A"
> in my mailbox)
> 
> 2. The behaviour of unaligned accesses vary depending on CPU.  Some
> fix-up the access, others load the word and then rotate it.  If we have
> decompressors which perform unaligned accesses, we need to fix this
> properly to avoid the CPU specific behaviour, rather than tweaking
> control bits to hide the problem.

I'm simply matching the behavior of the kernel itself. The A bit is cleared
for v7 kernels and compilers only generate unaligned accesses for v7.
Without this the initial state of the A bit is undefined as a bootloader
could have cleared it already. We should document the required state or set
it to what we want.

Rob

 arch/arm/boot/compressed/head.S |    1 +
 1 file changed, 1 insertion(+)

Comments

Russell King - ARM Linux Oct. 11, 2012, 1:09 p.m. UTC | #1
On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> The contents of this were already reviewed on this thread, so I sent this
> to the patch system and this was Russell's reply:

So that's why I couldn't find it - the mailing list thread has a different
subject line to the patch.  Don't do that.  Given the amount of list
traffic we have today, that's as good as not having been posted at all.

> > NAK for two reasons.
> > 
> > 1. It hasn't been on the list (I can't find a match for "clear SCTLR.A"
> > in my mailbox)
> > 
> > 2. The behaviour of unaligned accesses vary depending on CPU.  Some
> > fix-up the access, others load the word and then rotate it.  If we have
> > decompressors which perform unaligned accesses, we need to fix this
> > properly to avoid the CPU specific behaviour, rather than tweaking
> > control bits to hide the problem.
> 
> I'm simply matching the behavior of the kernel itself. The A bit is cleared
> for v7 kernels and compilers only generate unaligned accesses for v7.
> Without this the initial state of the A bit is undefined as a bootloader
> could have cleared it already. We should document the required state or set
> it to what we want.

Irrespective of this, (2) still stands.  Unaligned accesses in the
decompressor without a fixup (which will be very hard to provide)
will return different data depending on the CPU as I mention in point
2.

So, using unaligned accesses in the decompressor may not give expected
results in all cases, so they're best avoided.

And if they're avoided, then we don't care about the setting of the A
bit.
Rob Herring Oct. 11, 2012, 1:31 p.m. UTC | #2
On 10/11/2012 08:09 AM, Russell King - ARM Linux wrote:
> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
>> The contents of this were already reviewed on this thread, so I sent this
>> to the patch system and this was Russell's reply:
> 
> So that's why I couldn't find it - the mailing list thread has a different
> subject line to the patch.  Don't do that.  Given the amount of list
> traffic we have today, that's as good as not having been posted at all.
> 
>>> NAK for two reasons.
>>>
>>> 1. It hasn't been on the list (I can't find a match for "clear SCTLR.A"
>>> in my mailbox)
>>>
>>> 2. The behaviour of unaligned accesses vary depending on CPU.  Some
>>> fix-up the access, others load the word and then rotate it.  If we have
>>> decompressors which perform unaligned accesses, we need to fix this
>>> properly to avoid the CPU specific behaviour, rather than tweaking
>>> control bits to hide the problem.
>>
>> I'm simply matching the behavior of the kernel itself. The A bit is cleared
>> for v7 kernels and compilers only generate unaligned accesses for v7.
>> Without this the initial state of the A bit is undefined as a bootloader
>> could have cleared it already. We should document the required state or set
>> it to what we want.
> 
> Irrespective of this, (2) still stands.  Unaligned accesses in the
> decompressor without a fixup (which will be very hard to provide)
> will return different data depending on the CPU as I mention in point
> 2.

This only affects v7 cores. It should not vary for v7 cores as unaligned
access is a required feature. So how is it going to vary on v7 CPUs?
We've got bigger problems if there are v7 cores that don't handle
unaligned accesses.

Rob

> So, using unaligned accesses in the decompressor may not give expected
> results in all cases, so they're best avoided.
>
> And if they're avoided, then we don't care about the setting of the A
> bit.
>
Russell King - ARM Linux Oct. 11, 2012, 1:41 p.m. UTC | #3
On Thu, Oct 11, 2012 at 08:31:47AM -0500, Rob Herring wrote:
> On 10/11/2012 08:09 AM, Russell King - ARM Linux wrote:
> > On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >> The contents of this were already reviewed on this thread, so I sent this
> >> to the patch system and this was Russell's reply:
> > 
> > So that's why I couldn't find it - the mailing list thread has a different
> > subject line to the patch.  Don't do that.  Given the amount of list
> > traffic we have today, that's as good as not having been posted at all.
> > 
> >>> NAK for two reasons.
> >>>
> >>> 1. It hasn't been on the list (I can't find a match for "clear SCTLR.A"
> >>> in my mailbox)
> >>>
> >>> 2. The behaviour of unaligned accesses vary depending on CPU.  Some
> >>> fix-up the access, others load the word and then rotate it.  If we have
> >>> decompressors which perform unaligned accesses, we need to fix this
> >>> properly to avoid the CPU specific behaviour, rather than tweaking
> >>> control bits to hide the problem.
> >>
> >> I'm simply matching the behavior of the kernel itself. The A bit is cleared
> >> for v7 kernels and compilers only generate unaligned accesses for v7.
> >> Without this the initial state of the A bit is undefined as a bootloader
> >> could have cleared it already. We should document the required state or set
> >> it to what we want.
> > 
> > Irrespective of this, (2) still stands.  Unaligned accesses in the
> > decompressor without a fixup (which will be very hard to provide)
> > will return different data depending on the CPU as I mention in point
> > 2.
> 
> This only affects v7 cores. It should not vary for v7 cores as unaligned
> access is a required feature. So how is it going to vary on v7 CPUs?
> We've got bigger problems if there are v7 cores that don't handle
> unaligned accesses.

Rob,

Your patch may only affect v7 cores, but you've raised the issue of the
decompressor performing unaligned accesses in general.  Shall I re-repeat
my point over that or is the problem here going to finally sink in?
Russell King - ARM Linux Oct. 11, 2012, 1:59 p.m. UTC | #4
On Thu, Oct 11, 2012 at 08:31:47AM -0500, Rob Herring wrote:
> This only affects v7 cores. It should not vary for v7 cores as unaligned
> access is a required feature. So how is it going to vary on v7 CPUs?
> We've got bigger problems if there are v7 cores that don't handle
> unaligned accesses.

Oh, and this gives me a third reason to NAK this patch.  Why only ensure
that the A bit is clear for v7 CPUs?  Why not v6, v5, v4 too?  Why does
ARMv7 get this special treatment?

Any argument you can make for clearing the bit on ARMv7 also applies to
the other architectures too, but not even that negates my point (2) which
is far more fundamental.
Nicolas Pitre Oct. 11, 2012, 3:44 p.m. UTC | #5
On Thu, 11 Oct 2012, Russell King - ARM Linux wrote:

> On Thu, Oct 11, 2012 at 08:31:47AM -0500, Rob Herring wrote:
> > On 10/11/2012 08:09 AM, Russell King - ARM Linux wrote:
> > > On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> > >> The contents of this were already reviewed on this thread, so I sent this
> > >> to the patch system and this was Russell's reply:
> > > 
> > > So that's why I couldn't find it - the mailing list thread has a different
> > > subject line to the patch.  Don't do that.  Given the amount of list
> > > traffic we have today, that's as good as not having been posted at all.
> > > 
> > >>> NAK for two reasons.
> > >>>
> > >>> 1. It hasn't been on the list (I can't find a match for "clear SCTLR.A"
> > >>> in my mailbox)
> > >>>
> > >>> 2. The behaviour of unaligned accesses vary depending on CPU.  Some
> > >>> fix-up the access, others load the word and then rotate it.  If we have
> > >>> decompressors which perform unaligned accesses, we need to fix this
> > >>> properly to avoid the CPU specific behaviour, rather than tweaking
> > >>> control bits to hide the problem.
> > >>
> > >> I'm simply matching the behavior of the kernel itself. The A bit is cleared
> > >> for v7 kernels and compilers only generate unaligned accesses for v7.
> > >> Without this the initial state of the A bit is undefined as a bootloader
> > >> could have cleared it already. We should document the required state or set
> > >> it to what we want.
> > > 
> > > Irrespective of this, (2) still stands.  Unaligned accesses in the
> > > decompressor without a fixup (which will be very hard to provide)
> > > will return different data depending on the CPU as I mention in point
> > > 2.
> > 
> > This only affects v7 cores. It should not vary for v7 cores as unaligned
> > access is a required feature. So how is it going to vary on v7 CPUs?
> > We've got bigger problems if there are v7 cores that don't handle
> > unaligned accesses.
> 
> Rob,
> 
> Your patch may only affect v7 cores, but you've raised the issue of the
> decompressor performing unaligned accesses in general.  Shall I re-repeat
> my point over that or is the problem here going to finally sink in?

The decompressor is not performing direct unaligned accesses.  It uses 
the get_unaligned() and put_unaligned() accessors.  That means that 
we're in control of how this is happening.

So let's talk about the how.  On pre ARMv7, those accesses are performed 
with a series of byte accesses.  When compiling for ARMv7, gcc knows and 
that the hardware can do unaligned accesses, and it does optimize its 
output by using ldr/str instructions.  But the A bit has to be cleared 
in that case, and only in that case.  This is why the patch clears the A 
bit only for ARMv7.

So this patch is only setting up the hardware to match gcc's 
expectations when generating code from the use of get_unaligned() and 
put_unaligned() when optimizing for ARMv7.

As always, any code doing unaligned access and _not_ using those 
accessors is broken.


Nicolas
Nicolas Pitre Oct. 11, 2012, 3:58 p.m. UTC | #6
On Thu, 11 Oct 2012, Russell King - ARM Linux wrote:

> On Thu, Oct 11, 2012 at 08:31:47AM -0500, Rob Herring wrote:
> > This only affects v7 cores. It should not vary for v7 cores as unaligned
> > access is a required feature. So how is it going to vary on v7 CPUs?
> > We've got bigger problems if there are v7 cores that don't handle
> > unaligned accesses.
> 
> Oh, and this gives me a third reason to NAK this patch.  Why only ensure
> that the A bit is clear for v7 CPUs?  Why not v6, v5, v4 too?  Why does
> ARMv7 get this special treatment?

As I said, gcc knows that ARMv7 can perform word sized accesses even 
with misaligned pointers.  So when it is passed a pointer marked with 
the packed attribute, it will generate a series of byte accesses when 
compiling for anything but ARMv7, and use a single ldr or str when 
compiling for ARMv7.


Nicolas
Gregory CLEMENT Oct. 23, 2012, 8:32 p.m. UTC | #7
On 10/11/2012 05:44 PM, Nicolas Pitre wrote:
> On Thu, 11 Oct 2012, Russell King - ARM Linux wrote:
> 
>> On Thu, Oct 11, 2012 at 08:31:47AM -0500, Rob Herring wrote:
>>> On 10/11/2012 08:09 AM, Russell King - ARM Linux wrote:
>>>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
>>>>> The contents of this were already reviewed on this thread, so I sent this
>>>>> to the patch system and this was Russell's reply:
>>>>
>>>> So that's why I couldn't find it - the mailing list thread has a different
>>>> subject line to the patch.  Don't do that.  Given the amount of list
>>>> traffic we have today, that's as good as not having been posted at all.
>>>>
>>>>>> NAK for two reasons.
>>>>>>
>>>>>> 1. It hasn't been on the list (I can't find a match for "clear SCTLR.A"
>>>>>> in my mailbox)
>>>>>>
>>>>>> 2. The behaviour of unaligned accesses vary depending on CPU.  Some
>>>>>> fix-up the access, others load the word and then rotate it.  If we have
>>>>>> decompressors which perform unaligned accesses, we need to fix this
>>>>>> properly to avoid the CPU specific behaviour, rather than tweaking
>>>>>> control bits to hide the problem.
>>>>>
>>>>> I'm simply matching the behavior of the kernel itself. The A bit is cleared
>>>>> for v7 kernels and compilers only generate unaligned accesses for v7.
>>>>> Without this the initial state of the A bit is undefined as a bootloader
>>>>> could have cleared it already. We should document the required state or set
>>>>> it to what we want.
>>>>
>>>> Irrespective of this, (2) still stands.  Unaligned accesses in the
>>>> decompressor without a fixup (which will be very hard to provide)
>>>> will return different data depending on the CPU as I mention in point
>>>> 2.
>>>
>>> This only affects v7 cores. It should not vary for v7 cores as unaligned
>>> access is a required feature. So how is it going to vary on v7 CPUs?
>>> We've got bigger problems if there are v7 cores that don't handle
>>> unaligned accesses.
>>
>> Rob,
>>
>> Your patch may only affect v7 cores, but you've raised the issue of the
>> decompressor performing unaligned accesses in general.  Shall I re-repeat
>> my point over that or is the problem here going to finally sink in?
> 
> The decompressor is not performing direct unaligned accesses.  It uses 
> the get_unaligned() and put_unaligned() accessors.  That means that 
> we're in control of how this is happening.
> 
> So let's talk about the how.  On pre ARMv7, those accesses are performed 
> with a series of byte accesses.  When compiling for ARMv7, gcc knows and 
> that the hardware can do unaligned accesses, and it does optimize its 
> output by using ldr/str instructions.  But the A bit has to be cleared 
> in that case, and only in that case.  This is why the patch clears the A 
> bit only for ARMv7.
> 
> So this patch is only setting up the hardware to match gcc's 
> expectations when generating code from the use of get_unaligned() and 
> put_unaligned() when optimizing for ARMv7.
> 
> As always, any code doing unaligned access and _not_ using those 
> accessors is broken.

So is there a chance that this patch will be applied for 3.7?

Currently I can't boot anymore Armada XP or Armada 370, if kernel is
compressed in LZO. It's annoying.

Russell, did Nicolas manage to convince you?
If not, what should be the solution to fix this issue?

Thanks,

Gregory
Johannes Stezenbach Oct. 25, 2012, 9:34 a.m. UTC | #8
On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> 
> With recent compilers and move to generic unaligned.h in commit d25c881
> (ARM: 7493/1: use generic unaligned.h), unaligned accesses will be used
> by the LZO decompressor on v7 cores. So we need to make sure unaligned
> accesses are allowed by clearing the SCTLR A bit.

I just read this in http://gcc.gnu.org/gcc-4.7/changes.html:

  On ARM, when compiling for ARMv6 (but not ARMv6-M), ARMv7-A, ARMv7-R, or
  ARMv7-M, the new option -munaligned-access is active by default, which for
  some source codes generates code that accesses memory on unaligned addresses.
  This will require the kernel of those systems to enable such accesses
  (controlled by CP15 register c1, refer to ARM documentation). Alternatively
  or for compatibility with kernels where unaligned accesses are not supported,
  all code has to be compiled with -mno-unaligned-access. Linux/ARM in official
  releases has automatically and unconditionally supported unaligned accesses
  as emitted by GCC due to this option being active, since Linux version 2.6.28.

My understanding is that gcc, using the same generic unaligned.h
source code, will generate code for ARMv6 and ARMv7 that uses
unaligned access, while for ARMv5 and older it won't.  So it seems
gcc requires Linux to clear SCTLR.A and set SCTLR.U for ARMv6+.

Or add -mno-unaligned-access.

Is my understanding correct?

> While v6 can support unaligned accesses, it is optional and current
> compilers won't emit unaligned accesses. So we don't clear the A bit for
> v6.

not true according to the gcc changes page


Thanks
Johannes
Russell King - ARM Linux Oct. 25, 2012, 12:07 p.m. UTC | #9
On Tue, Oct 23, 2012 at 10:32:44PM +0200, Gregory CLEMENT wrote:
> So is there a chance that this patch will be applied for 3.7?
> 
> Currently I can't boot anymore Armada XP or Armada 370, if kernel is
> compressed in LZO. It's annoying.
> 
> Russell, did Nicolas manage to convince you?

Frankly, no - less so now that we have a question mark over ARMv6 which
seems to conflict with Nicolas' justification.  This issue is larger than
just ARMv7...
Rob Herring Oct. 25, 2012, 12:41 p.m. UTC | #10
On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
>>
>> With recent compilers and move to generic unaligned.h in commit d25c881
>> (ARM: 7493/1: use generic unaligned.h), unaligned accesses will be used
>> by the LZO decompressor on v7 cores. So we need to make sure unaligned
>> accesses are allowed by clearing the SCTLR A bit.
> 
> I just read this in http://gcc.gnu.org/gcc-4.7/changes.html:
> 
>   On ARM, when compiling for ARMv6 (but not ARMv6-M), ARMv7-A, ARMv7-R, or
>   ARMv7-M, the new option -munaligned-access is active by default, which for
>   some source codes generates code that accesses memory on unaligned addresses.
>   This will require the kernel of those systems to enable such accesses
>   (controlled by CP15 register c1, refer to ARM documentation). Alternatively
>   or for compatibility with kernels where unaligned accesses are not supported,
>   all code has to be compiled with -mno-unaligned-access. Linux/ARM in official
>   releases has automatically and unconditionally supported unaligned accesses
>   as emitted by GCC due to this option being active, since Linux version 2.6.28.

I don't think there is such a thing as ARMv6-M.

> My understanding is that gcc, using the same generic unaligned.h
> source code, will generate code for ARMv6 and ARMv7 that uses
> unaligned access, while for ARMv5 and older it won't.  So it seems
> gcc requires Linux to clear SCTLR.A and set SCTLR.U for ARMv6+.
> 
> Or add -mno-unaligned-access.
> 
> Is my understanding correct?
> 
>> While v6 can support unaligned accesses, it is optional and current
>> compilers won't emit unaligned accesses. So we don't clear the A bit for
>> v6.
> 
> not true according to the gcc changes page

What are you going to believe: documentation or what the compiler
emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
support backported and 4.7.2, unaligned accesses are emitted for v7
only. I guess default here means it is the default unless you change the
default in your build of gcc.

If we are going to do combined v6k and v7 kernels, then it would be nice
if we could get the compiler to emit unaligned accesses (assuming we
agree we can require that for v6).

Rob
Arnd Bergmann Oct. 25, 2012, 1:30 p.m. UTC | #11
On Thursday 25 October 2012, Rob Herring wrote:
> > On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >>
> >> With recent compilers and move to generic unaligned.h in commit d25c881
> >> (ARM: 7493/1: use generic unaligned.h), unaligned accesses will be used
> >> by the LZO decompressor on v7 cores. So we need to make sure unaligned
> >> accesses are allowed by clearing the SCTLR A bit.
> > 
> > I just read this in http://gcc.gnu.org/gcc-4.7/changes.html:
> > 
> >   On ARM, when compiling for ARMv6 (but not ARMv6-M), ARMv7-A, ARMv7-R, or
> >   ARMv7-M, the new option -munaligned-access is active by default, which for
> >   some source codes generates code that accesses memory on unaligned addresses.
> >   This will require the kernel of those systems to enable such accesses
> >   (controlled by CP15 register c1, refer to ARM documentation). Alternatively
> >   or for compatibility with kernels where unaligned accesses are not supported,
> >   all code has to be compiled with -mno-unaligned-access. Linux/ARM in official
> >   releases has automatically and unconditionally supported unaligned accesses
> >   as emitted by GCC due to this option being active, since Linux version 2.6.28.
> 
> I don't think there is such a thing as ARMv6-M.

There is, that would be the Cortex-M0/M0+/M1. You cannot run Linux on these.

	Arnd
Johannes Stezenbach Oct. 25, 2012, 2:16 p.m. UTC | #12
On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> > On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> > 
> >> While v6 can support unaligned accesses, it is optional and current
> >> compilers won't emit unaligned accesses. So we don't clear the A bit for
> >> v6.
> > 
> > not true according to the gcc changes page
> 
> What are you going to believe: documentation or what the compiler
> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> support backported and 4.7.2, unaligned accesses are emitted for v7
> only. I guess default here means it is the default unless you change the
> default in your build of gcc.

Since ARMv6 can handle unaligned access in the same way as ARMv7
it seems a clear bug in gcc which might hopefully get fixed.
Thus in this case I think it is reasonable to follow the
gcc documentation, otherwise the code would break for ARMv6
when gcc gets fixed.


Johannes
Rob Herring Oct. 25, 2012, 2:25 p.m. UTC | #13
On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
>> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
>>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
>>>
>>>> While v6 can support unaligned accesses, it is optional and current
>>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
>>>> v6.
>>>
>>> not true according to the gcc changes page
>>
>> What are you going to believe: documentation or what the compiler
>> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
>> support backported and 4.7.2, unaligned accesses are emitted for v7
>> only. I guess default here means it is the default unless you change the
>> default in your build of gcc.
> 
> Since ARMv6 can handle unaligned access in the same way as ARMv7
> it seems a clear bug in gcc which might hopefully get fixed.
> Thus in this case I think it is reasonable to follow the
> gcc documentation, otherwise the code would break for ARMv6
> when gcc gets fixed.

But the compiler can't assume the state of the U bit. I think it is
still legal on v6 to not support unaligned accesses, but on v7 it is
required. All the standard v6 ARM cores support it, but I'm not sure
about custom cores or if there are SOCs with buses that don't support
unaligned accesses properly.

Rob
Nicolas Pitre Oct. 25, 2012, 3:02 p.m. UTC | #14
On Thu, 25 Oct 2012, Rob Herring wrote:

> On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> > On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> >> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> >>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >>>
> >>>> While v6 can support unaligned accesses, it is optional and current
> >>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> >>>> v6.
> >>>
> >>> not true according to the gcc changes page
> >>
> >> What are you going to believe: documentation or what the compiler
> >> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> >> support backported and 4.7.2, unaligned accesses are emitted for v7
> >> only. I guess default here means it is the default unless you change the
> >> default in your build of gcc.
> > 
> > Since ARMv6 can handle unaligned access in the same way as ARMv7
> > it seems a clear bug in gcc which might hopefully get fixed.
> > Thus in this case I think it is reasonable to follow the
> > gcc documentation, otherwise the code would break for ARMv6
> > when gcc gets fixed.
> 
> But the compiler can't assume the state of the U bit. I think it is
> still legal on v6 to not support unaligned accesses, but on v7 it is
> required. All the standard v6 ARM cores support it, but I'm not sure
> about custom cores or if there are SOCs with buses that don't support
> unaligned accesses properly.

The fact is that gcc assumes the A bit is cleared when generating code 
for ARMv7, period.  We need to conform our environment to gcc 
expectations, or always compile the decompressor using a lower 
architecture level than ARMv7.  My preference is the former.


Nicolas
Johannes Stezenbach Oct. 25, 2012, 3:08 p.m. UTC | #15
On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
> On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> > On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> >> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> >>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >>>
> >>>> While v6 can support unaligned accesses, it is optional and current
> >>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> >>>> v6.
> >>>
> >>> not true according to the gcc changes page
> >>
> >> What are you going to believe: documentation or what the compiler
> >> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> >> support backported and 4.7.2, unaligned accesses are emitted for v7
> >> only. I guess default here means it is the default unless you change the
> >> default in your build of gcc.
> > 
> > Since ARMv6 can handle unaligned access in the same way as ARMv7
> > it seems a clear bug in gcc which might hopefully get fixed.
> > Thus in this case I think it is reasonable to follow the
> > gcc documentation, otherwise the code would break for ARMv6
> > when gcc gets fixed.
> 
> But the compiler can't assume the state of the U bit. I think it is
> still legal on v6 to not support unaligned accesses, but on v7 it is
> required. All the standard v6 ARM cores support it, but I'm not sure
> about custom cores or if there are SOCs with buses that don't support
> unaligned accesses properly.

Well, I read the "...since Linux version 2.6.28" comment
in the gcc changes page in the way that they assume the
U-bit is set. (Although I'm not sure it really is???)

Looking at it from another side, if using the hw unaligned
access capability gives a performance benefit then it
would be useful to support it even if gcc doesn't
behave as documented.  One could still use a special
unaligned.h for ARMv6 to get the performance benefit.
(If it doesn't give performance benfit, then why
bother, let's just use -mno-unaligned-access for v7, too.)

In the ARMv6 ARM unaligned support and the U-bit is
not optional, so you could use the same SoC bus argument
for some hypothetical v7 SoCs.


Johannes
tip-bot for Dave Martin Nov. 5, 2012, 10:48 a.m. UTC | #16
On Thu, Oct 25, 2012 at 05:08:16PM +0200, Johannes Stezenbach wrote:
> On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
> > On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> > > On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> > >> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> > >>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> > >>>
> > >>>> While v6 can support unaligned accesses, it is optional and current
> > >>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> > >>>> v6.
> > >>>
> > >>> not true according to the gcc changes page
> > >>
> > >> What are you going to believe: documentation or what the compiler
> > >> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> > >> support backported and 4.7.2, unaligned accesses are emitted for v7
> > >> only. I guess default here means it is the default unless you change the
> > >> default in your build of gcc.
> > > 
> > > Since ARMv6 can handle unaligned access in the same way as ARMv7
> > > it seems a clear bug in gcc which might hopefully get fixed.
> > > Thus in this case I think it is reasonable to follow the
> > > gcc documentation, otherwise the code would break for ARMv6
> > > when gcc gets fixed.
> > 
> > But the compiler can't assume the state of the U bit. I think it is
> > still legal on v6 to not support unaligned accesses, but on v7 it is
> > required. All the standard v6 ARM cores support it, but I'm not sure
> > about custom cores or if there are SOCs with buses that don't support
> > unaligned accesses properly.
> 
> Well, I read the "...since Linux version 2.6.28" comment
> in the gcc changes page in the way that they assume the
> U-bit is set. (Although I'm not sure it really is???)

Actually, the kernel checks the arch version and the U bit on boot,
and chooses the appropriate setting for the A bit depending on the
result.  (See arch/arm/mm/alignment.c:alignment_init().)

Currently, we depend on the CPU reset behaviour or firmware/
bootloader to set the U bit for v6, but the behaviour should be
correct either way, though unaligned accesses will obviously
perform (much) better with U=1.

From the compiler's point of view we have always implemented the
U=1 behaviour, but it has to be done via the alignment fault
handler prior to v6 or with U=0.

> Looking at it from another side, if using the hw unaligned
> access capability gives a performance benefit then it
> would be useful to support it even if gcc doesn't
> behave as documented.  One could still use a special
> unaligned.h for ARMv6 to get the performance benefit.
> (If it doesn't give performance benfit, then why
> bother, let's just use -mno-unaligned-access for v7, too.)

For v7, we should definitely use -munaligned-access where available
(unless it's the default?)

For v6, the question is whether there is any legitimate reason
ever to run the kernel with U=0.  If not, could we explicitly
set it early and build with -munaligned-access where the compiler
supports this?

The only counterargument I can think of is that some people might
be running some ancient userspace which actually relies on the U=0
behaviour.  I don't know whether anyone is actually doing that on
v6, though.

Cheers
---Dave

> In the ARMv6 ARM unaligned support and the U-bit is
> not optional, so you could use the same SoC bus argument
> for some hypothetical v7 SoCs.
> 
> 
> Johannes
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Russell King - ARM Linux Nov. 5, 2012, 11:13 a.m. UTC | #17
On Mon, Nov 05, 2012 at 10:48:50AM +0000, Dave Martin wrote:
> On Thu, Oct 25, 2012 at 05:08:16PM +0200, Johannes Stezenbach wrote:
> > On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
> > > On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> > > > On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> > > >> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> > > >>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> > > >>>
> > > >>>> While v6 can support unaligned accesses, it is optional and current
> > > >>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> > > >>>> v6.
> > > >>>
> > > >>> not true according to the gcc changes page
> > > >>
> > > >> What are you going to believe: documentation or what the compiler
> > > >> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> > > >> support backported and 4.7.2, unaligned accesses are emitted for v7
> > > >> only. I guess default here means it is the default unless you change the
> > > >> default in your build of gcc.
> > > > 
> > > > Since ARMv6 can handle unaligned access in the same way as ARMv7
> > > > it seems a clear bug in gcc which might hopefully get fixed.
> > > > Thus in this case I think it is reasonable to follow the
> > > > gcc documentation, otherwise the code would break for ARMv6
> > > > when gcc gets fixed.
> > > 
> > > But the compiler can't assume the state of the U bit. I think it is
> > > still legal on v6 to not support unaligned accesses, but on v7 it is
> > > required. All the standard v6 ARM cores support it, but I'm not sure
> > > about custom cores or if there are SOCs with buses that don't support
> > > unaligned accesses properly.
> > 
> > Well, I read the "...since Linux version 2.6.28" comment
> > in the gcc changes page in the way that they assume the
> > U-bit is set. (Although I'm not sure it really is???)
> 
> Actually, the kernel checks the arch version and the U bit on boot,
> and chooses the appropriate setting for the A bit depending on the
> result.  (See arch/arm/mm/alignment.c:alignment_init().)

That is in the kernel itself, _after_ the decompressor has run.  It is
not relevant to any discussion about the decompressor.

> Currently, we depend on the CPU reset behaviour or firmware/
> bootloader to set the U bit for v6, but the behaviour should be
> correct either way, though unaligned accesses will obviously
> perform (much) better with U=1.

Will someone _PLEASE_ address my initial comments against this patch
in light of the fact that it's now been proven _NOT_ to be just a V7
issue, rather than everyone seemingly buring their heads in the sand
over this.

The fact is, unaligned accesses in the decompressor are *undefined* at
present.

> For v7, we should definitely use -munaligned-access where available
> (unless it's the default?)

No such option on my compiler - according to the manual I have, the only
option there is starting -munaligned is on SPARC for -munaligned-doubles.

However, I believe GCC does believe that unaligned accesses are fine on
V6 and above.
tip-bot for Dave Martin Nov. 5, 2012, 1:02 p.m. UTC | #18
On Mon, Nov 05, 2012 at 11:13:46AM +0000, Russell King - ARM Linux wrote:
> On Mon, Nov 05, 2012 at 10:48:50AM +0000, Dave Martin wrote:
> > On Thu, Oct 25, 2012 at 05:08:16PM +0200, Johannes Stezenbach wrote:
> > > On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
> > > > On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> > > > > On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> > > > >> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> > > > >>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> > > > >>>
> > > > >>>> While v6 can support unaligned accesses, it is optional and current
> > > > >>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> > > > >>>> v6.
> > > > >>>
> > > > >>> not true according to the gcc changes page
> > > > >>
> > > > >> What are you going to believe: documentation or what the compiler
> > > > >> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> > > > >> support backported and 4.7.2, unaligned accesses are emitted for v7
> > > > >> only. I guess default here means it is the default unless you change the
> > > > >> default in your build of gcc.
> > > > > 
> > > > > Since ARMv6 can handle unaligned access in the same way as ARMv7
> > > > > it seems a clear bug in gcc which might hopefully get fixed.
> > > > > Thus in this case I think it is reasonable to follow the
> > > > > gcc documentation, otherwise the code would break for ARMv6
> > > > > when gcc gets fixed.
> > > > 
> > > > But the compiler can't assume the state of the U bit. I think it is
> > > > still legal on v6 to not support unaligned accesses, but on v7 it is
> > > > required. All the standard v6 ARM cores support it, but I'm not sure
> > > > about custom cores or if there are SOCs with buses that don't support
> > > > unaligned accesses properly.
> > > 
> > > Well, I read the "...since Linux version 2.6.28" comment
> > > in the gcc changes page in the way that they assume the
> > > U-bit is set. (Although I'm not sure it really is???)
> > 
> > Actually, the kernel checks the arch version and the U bit on boot,
> > and chooses the appropriate setting for the A bit depending on the
> > result.  (See arch/arm/mm/alignment.c:alignment_init().)
> 
> That is in the kernel itself, _after_ the decompressor has run.  It is
> not relevant to any discussion about the decompressor.

This was merely meant as an argument that the kernel does not make
assumptions about the U bit, and so the zImage decompressor probably
shouldn't either.

> > Currently, we depend on the CPU reset behaviour or firmware/
> > bootloader to set the U bit for v6, but the behaviour should be
> > correct either way, though unaligned accesses will obviously
> > perform (much) better with U=1.
> 
> Will someone _PLEASE_ address my initial comments against this patch
> in light of the fact that it's now been proven _NOT_ to be just a V7
> issue, rather than everyone seemingly buring their heads in the sand
> over this.
> 
> The fact is, unaligned accesses in the decompressor are *undefined* at
> present.

Why not allow unaligned accesses in the decompressor, though, both
for v6 and v7?

The decompressors run with the cache on, so we should have fault-free
unaligned access on all CPUs which support it, provided that we set the
SCTLR bits appropriately, and provided that the decompressers are
written in correct C.

The counterargument is that zImage would then just fall over if booted
on a CPU older than the baseline architecture the kernel was built for.
However, that should be detected before executing any C code, giving
the zImage code a chance to bail out with a suitable error message.

A kernel which contains support for v6/v7 platforms cannot work on older
CPUs anyway, because the kernel itself doesn't support such combinations.

> > For v7, we should definitely use -munaligned-access where available
> > (unless it's the default?)
> 
> No such option on my compiler - according to the manual I have, the only
> option there is starting -munaligned is on SPARC for -munaligned-doubles.

OK, I guess that's something backported into the Linaro toolchain I'm
currently using then.  But it seems a good idea to use this if available,
because it allows the compiler to generate better code in some situations,
especially for packed struct access.
 
> However, I believe GCC does believe that unaligned accesses are fine on
> V6 and above.

Possibly, but I've never seen it use them deliberately, prior to the
-munaligned-access support.

Cheers
---Dave
Johannes Stezenbach Nov. 5, 2012, 1:43 p.m. UTC | #19
On Mon, Nov 05, 2012 at 01:02:55PM +0000, Dave Martin wrote:
> On Mon, Nov 05, 2012 at 11:13:46AM +0000, Russell King - ARM Linux wrote:
> > On Mon, Nov 05, 2012 at 10:48:50AM +0000, Dave Martin wrote:
> > > For v7, we should definitely use -munaligned-access where available
> > > (unless it's the default?)
> > 
> > No such option on my compiler - according to the manual I have, the only
> > option there is starting -munaligned is on SPARC for -munaligned-doubles.
> 
> OK, I guess that's something backported into the Linaro toolchain I'm
> currently using then.  But it seems a good idea to use this if available,
> because it allows the compiler to generate better code in some situations,
> especially for packed struct access.
>  
> > However, I believe GCC does believe that unaligned accesses are fine on
> > V6 and above.
> 
> Possibly, but I've never seen it use them deliberately, prior to the
> -munaligned-access support.

http://gcc.gnu.org/gcc-4.7/changes.html says -munaligned-access is
enabled by default.

I haven't had a chance to try gcc-4.7 yet, but gcc-4.6+linaro
has the option but fails to generate the expected code for ARMv6
while it works for ARMv7.  However it does

#define __ARM_FEATURE_UNALIGNED 1

and

        .eabi_attribute 34, 1

This seems to be the commit which introduced unaligned support in gcc:
http://repo.or.cz/w/official-gcc.git/commitdiff/eb04cafba3a6f1eddbdb5ec031d8a7074930d5b9
I cannot figure out why this works for v7 but not for v6.

(I used gcc-4.6.4 20121001 (prerelease) toolchain built with crosstool-NG.)


Johannes
Rob Herring Nov. 5, 2012, 1:48 p.m. UTC | #20
On 11/05/2012 05:13 AM, Russell King - ARM Linux wrote:
> On Mon, Nov 05, 2012 at 10:48:50AM +0000, Dave Martin wrote:
>> On Thu, Oct 25, 2012 at 05:08:16PM +0200, Johannes Stezenbach wrote:
>>> On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
>>>> On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
>>>>> On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
>>>>>> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
>>>>>>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
>>>>>>>
>>>>>>>> While v6 can support unaligned accesses, it is optional and current
>>>>>>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
>>>>>>>> v6.
>>>>>>>
>>>>>>> not true according to the gcc changes page
>>>>>>
>>>>>> What are you going to believe: documentation or what the compiler
>>>>>> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
>>>>>> support backported and 4.7.2, unaligned accesses are emitted for v7
>>>>>> only. I guess default here means it is the default unless you change the
>>>>>> default in your build of gcc.
>>>>>
>>>>> Since ARMv6 can handle unaligned access in the same way as ARMv7
>>>>> it seems a clear bug in gcc which might hopefully get fixed.
>>>>> Thus in this case I think it is reasonable to follow the
>>>>> gcc documentation, otherwise the code would break for ARMv6
>>>>> when gcc gets fixed.
>>>>
>>>> But the compiler can't assume the state of the U bit. I think it is
>>>> still legal on v6 to not support unaligned accesses, but on v7 it is
>>>> required. All the standard v6 ARM cores support it, but I'm not sure
>>>> about custom cores or if there are SOCs with buses that don't support
>>>> unaligned accesses properly.
>>>
>>> Well, I read the "...since Linux version 2.6.28" comment
>>> in the gcc changes page in the way that they assume the
>>> U-bit is set. (Although I'm not sure it really is???)
>>
>> Actually, the kernel checks the arch version and the U bit on boot,
>> and chooses the appropriate setting for the A bit depending on the
>> result.  (See arch/arm/mm/alignment.c:alignment_init().)
> 
> That is in the kernel itself, _after_ the decompressor has run.  It is
> not relevant to any discussion about the decompressor.
> 
>> Currently, we depend on the CPU reset behaviour or firmware/
>> bootloader to set the U bit for v6, but the behaviour should be
>> correct either way, though unaligned accesses will obviously
>> perform (much) better with U=1.
> 
> Will someone _PLEASE_ address my initial comments against this patch
> in light of the fact that it's now been proven _NOT_ to be just a V7
> issue, rather than everyone seemingly buring their heads in the sand
> over this.

I tried adding -munaligned-accesses on a v6 build and still get byte
accesses rather than unaligned word accesses. So this does seem to be a
v7 only issue based on what gcc will currently produce. Copying Michael
Hope who can hopefully provide some insight on why v6 unaligned accesses
are not enabled.

> The fact is, unaligned accesses in the decompressor are *undefined* at
> present.
> 
>> For v7, we should definitely use -munaligned-access where available
>> (unless it's the default?)
> 
> No such option on my compiler - according to the manual I have, the only
> option there is starting -munaligned is on SPARC for -munaligned-doubles.

It's only added in 4.7 and backported to Linaro 4.6.3.

Rob

> However, I believe GCC does believe that unaligned accesses are fine on
> V6 and above.
>
Russell King - ARM Linux Nov. 5, 2012, 3:39 p.m. UTC | #21
On Mon, Nov 05, 2012 at 01:02:55PM +0000, Dave Martin wrote:
> Why not allow unaligned accesses in the decompressor, though, both
> for v6 and v7?

EXACTLY.
Nicolas Pitre Nov. 5, 2012, 4:13 p.m. UTC | #22
On Mon, 5 Nov 2012, Russell King - ARM Linux wrote:

> On Mon, Nov 05, 2012 at 01:02:55PM +0000, Dave Martin wrote:
> > Why not allow unaligned accesses in the decompressor, though, both
> > for v6 and v7?
> 
> EXACTLY.

I have no objections to that.  In fact, I made a remark to this effect 
in my initial review of this patch.  Whether or not gcc does take 
advantage of this hardware ability in the end is orthogonal.


Nicolas
Michael Hope Nov. 5, 2012, 10:02 p.m. UTC | #23
On 6 November 2012 02:48, Rob Herring <robherring2@gmail.com> wrote:
>
> On 11/05/2012 05:13 AM, Russell King - ARM Linux wrote:
> > On Mon, Nov 05, 2012 at 10:48:50AM +0000, Dave Martin wrote:
> >> On Thu, Oct 25, 2012 at 05:08:16PM +0200, Johannes Stezenbach wrote:
> >>> On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
> >>>> On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> >>>>> On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> >>>>>> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> >>>>>>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >>>>>>>
> >>>>>>>> While v6 can support unaligned accesses, it is optional and current
> >>>>>>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> >>>>>>>> v6.
> >>>>>>>
> >>>>>>> not true according to the gcc changes page
> >>>>>>
> >>>>>> What are you going to believe: documentation or what the compiler
> >>>>>> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> >>>>>> support backported and 4.7.2, unaligned accesses are emitted for v7
> >>>>>> only. I guess default here means it is the default unless you change the
> >>>>>> default in your build of gcc.
> >>>>>
> >>>>> Since ARMv6 can handle unaligned access in the same way as ARMv7
> >>>>> it seems a clear bug in gcc which might hopefully get fixed.
> >>>>> Thus in this case I think it is reasonable to follow the
> >>>>> gcc documentation, otherwise the code would break for ARMv6
> >>>>> when gcc gets fixed.
> >>>>
> >>>> But the compiler can't assume the state of the U bit. I think it is
> >>>> still legal on v6 to not support unaligned accesses, but on v7 it is
> >>>> required. All the standard v6 ARM cores support it, but I'm not sure
> >>>> about custom cores or if there are SOCs with buses that don't support
> >>>> unaligned accesses properly.
> >>>
> >>> Well, I read the "...since Linux version 2.6.28" comment
> >>> in the gcc changes page in the way that they assume the
> >>> U-bit is set. (Although I'm not sure it really is???)
> >>
> >> Actually, the kernel checks the arch version and the U bit on boot,
> >> and chooses the appropriate setting for the A bit depending on the
> >> result.  (See arch/arm/mm/alignment.c:alignment_init().)
> >
> > That is in the kernel itself, _after_ the decompressor has run.  It is
> > not relevant to any discussion about the decompressor.
> >
> >> Currently, we depend on the CPU reset behaviour or firmware/
> >> bootloader to set the U bit for v6, but the behaviour should be
> >> correct either way, though unaligned accesses will obviously
> >> perform (much) better with U=1.
> >
> > Will someone _PLEASE_ address my initial comments against this patch
> > in light of the fact that it's now been proven _NOT_ to be just a V7
> > issue, rather than everyone seemingly buring their heads in the sand
> > over this.
>
> I tried adding -munaligned-accesses on a v6 build and still get byte
> accesses rather than unaligned word accesses. So this does seem to be a
> v7 only issue based on what gcc will currently produce. Copying Michael
> Hope who can hopefully provide some insight on why v6 unaligned accesses
> are not enabled.

This looks like a bug.  Unaligned access is enabled for armv6 but
seems to only take effect for cores with Thumb-2.  Here's a test case
both with unaligned field access and unaligned block copy:

struct foo
{
  char a;
  int b;
  struct
  {
    int x[3];
  } c;
} __attribute__((packed));

int get_field(struct foo *p)
{
  return p->b;
}

int copy_block(struct foo *p, struct foo *q)
{
  p->c = q->c;
}

With -march=armv7-a you get the correct:

bar:
	ldr	r0, [r0, #1]	@ unaligned	@ 11	unaligned_loadsi/2	[length = 4]
	bx	lr	@ 21	*arm_return	[length = 12]

baz:
	str	r4, [sp, #-4]!	@ 25	*push_multi	[length = 4]
	mov	r2, r0	@ 2	*arm_movsi_vfp/1	[length = 4]
	ldr	r4, [r1, #5]!	@ unaligned	@ 9	unaligned_loadsi/2	[length = 4]
	ldr	ip, [r1, #4]	@ unaligned	@ 10	unaligned_loadsi/2	[length = 4]
	ldr	r1, [r1, #8]	@ unaligned	@ 11	unaligned_loadsi/2	[length = 4]
	str	r4, [r2, #5]	@ unaligned	@ 12	unaligned_storesi/2	[length = 4]
	str	ip, [r2, #9]	@ unaligned	@ 13	unaligned_storesi/2	[length = 4]
	str	r1, [r2, #13]	@ unaligned	@ 14	unaligned_storesi/2	[length = 4]
	ldmfd	sp!, {r4}
	bx	lr

With -march=armv6 you get a byte-by-byte field access and a correct
unaligned block copy:

bar:
	ldrb	r1, [r0, #2]	@ zero_extendqisi2
	ldrb	r3, [r0, #1]	@ zero_extendqisi2
	ldrb	r2, [r0, #3]	@ zero_extendqisi2
	ldrb	r0, [r0, #4]	@ zero_extendqisi2
	orr	r3, r3, r1, asl #8
	orr	r3, r3, r2, asl #16
	orr	r0, r3, r0, asl #24
	bx	lr

baz:
	str	r4, [sp, #-4]!
	mov	r2, r0
	ldr	r4, [r1, #5]!	@ unaligned
	ldr	ip, [r1, #4]	@ unaligned
	ldr	r1, [r1, #8]	@ unaligned
	str	r4, [r2, #5]	@ unaligned
	str	ip, [r2, #9]	@ unaligned
	str	r1, [r2, #13]	@ unaligned
	ldmfd	sp!, {r4}
	bx	lr

readelf -A shows that the compiler planned to use unaligned access in
both.  My suspicion is that GCC is using the extv pattern to extract
the field from memory, and that pattern is only enabled for Thumb-2
capable cores.

I've logged PR55218.  We'll discuss it at our next meeting.

-- Michael
Johannes Stezenbach Feb. 20, 2013, 2:56 p.m. UTC | #24
Replying to old thread, for full context see
http://thread.gmane.org/gmane.linux.ports.arm.kernel/180226/focus=197914

On Tue, Nov 06, 2012 at 11:02:27AM +1300, Michael Hope wrote:
> On 6 November 2012 02:48, Rob Herring <robherring2@gmail.com> wrote:
> >
> > I tried adding -munaligned-accesses on a v6 build and still get byte
> > accesses rather than unaligned word accesses. So this does seem to be a
> > v7 only issue based on what gcc will currently produce. Copying Michael
> > Hope who can hopefully provide some insight on why v6 unaligned accesses
> > are not enabled.
> 
> This looks like a bug.  Unaligned access is enabled for armv6 but
> seems to only take effect for cores with Thumb-2.  Here's a test case
> both with unaligned field access and unaligned block copy:
> 
> struct foo
> {
>   char a;
>   int b;
>   struct
>   {
>     int x[3];
>   } c;
> } __attribute__((packed));
> 
> int get_field(struct foo *p)
> {
>   return p->b;
> }
> 
> int copy_block(struct foo *p, struct foo *q)
> {
>   p->c = q->c;
> }
> 
> With -march=armv7-a you get the correct:
> 
> bar:
> 	ldr	r0, [r0, #1]	@ unaligned	@ 11	unaligned_loadsi/2	[length = 4]
> 	bx	lr	@ 21	*arm_return	[length = 12]
> 
> baz:
> 	str	r4, [sp, #-4]!	@ 25	*push_multi	[length = 4]
> 	mov	r2, r0	@ 2	*arm_movsi_vfp/1	[length = 4]
> 	ldr	r4, [r1, #5]!	@ unaligned	@ 9	unaligned_loadsi/2	[length = 4]
> 	ldr	ip, [r1, #4]	@ unaligned	@ 10	unaligned_loadsi/2	[length = 4]
> 	ldr	r1, [r1, #8]	@ unaligned	@ 11	unaligned_loadsi/2	[length = 4]
> 	str	r4, [r2, #5]	@ unaligned	@ 12	unaligned_storesi/2	[length = 4]
> 	str	ip, [r2, #9]	@ unaligned	@ 13	unaligned_storesi/2	[length = 4]
> 	str	r1, [r2, #13]	@ unaligned	@ 14	unaligned_storesi/2	[length = 4]
> 	ldmfd	sp!, {r4}
> 	bx	lr
> 
> With -march=armv6 you get a byte-by-byte field access and a correct
> unaligned block copy:
> 
> bar:
> 	ldrb	r1, [r0, #2]	@ zero_extendqisi2
> 	ldrb	r3, [r0, #1]	@ zero_extendqisi2
> 	ldrb	r2, [r0, #3]	@ zero_extendqisi2
> 	ldrb	r0, [r0, #4]	@ zero_extendqisi2
> 	orr	r3, r3, r1, asl #8
> 	orr	r3, r3, r2, asl #16
> 	orr	r0, r3, r0, asl #24
> 	bx	lr
> 
> baz:
> 	str	r4, [sp, #-4]!
> 	mov	r2, r0
> 	ldr	r4, [r1, #5]!	@ unaligned
> 	ldr	ip, [r1, #4]	@ unaligned
> 	ldr	r1, [r1, #8]	@ unaligned
> 	str	r4, [r2, #5]	@ unaligned
> 	str	ip, [r2, #9]	@ unaligned
> 	str	r1, [r2, #13]	@ unaligned
> 	ldmfd	sp!, {r4}
> 	bx	lr
> 
> readelf -A shows that the compiler planned to use unaligned access in
> both.  My suspicion is that GCC is using the extv pattern to extract
> the field from memory, and that pattern is only enabled for Thumb-2
> capable cores.
> 
> I've logged PR55218.  We'll discuss it at our next meeting.

Just tried with gcc-linaro-4.7-2013.01 (gcc-4.7.3 20130102 (prerelease)),
the issue is still unfixed.  Do you have any idea how to fix it?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55218

Thanks,
Johannes
Johannes Stezenbach Feb. 20, 2013, 3:07 p.m. UTC | #25
On Wed, Feb 20, 2013 at 03:56:54PM +0100, Johannes Stezenbach wrote:
> Replying to old thread, for full context see
> http://thread.gmane.org/gmane.linux.ports.arm.kernel/180226/focus=197914
> 
> On Tue, Nov 06, 2012 at 11:02:27AM +1300, Michael Hope wrote:

Oh, mail to michael.hope@linaro.org bounces (account disabled).
That might explain why the bug is not fixed :-(

Johannes
diff mbox

Patch

diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index bc67cbf..b2e30b8 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -654,6 +654,7 @@  __armv7_mmu_cache_on:
 #endif
 		mrc	p15, 0, r0, c1, c0, 0	@ read control reg
 		bic	r0, r0, #1 << 28	@ clear SCTLR.TRE
+		bic	r0, r0, #1 << 1		@ clear SCTLR.A
 		orr	r0, r0, #0x5000		@ I-cache enable, RR cache replacement
 		orr	r0, r0, #0x003c		@ write buffer
 #ifdef CONFIG_MMU