diff mbox

Do I need to invalidate caches before enabling (on ARMv7)?

Message ID 20141206210031.GA1758@afzal-ThinkPad-R50e (mailing list archive)
State New, archived
Headers show

Commit Message

afzal mohammed Dec. 6, 2014, 9 p.m. UTC
Hi Uwe,

On Thu, Dec 04, 2014 at 11:03:57AM +0100, Uwe Kleine-König wrote:

> The breakage I'm currently seeing in barebox might well be explained by
> stale I-cache entries and barebox (as of now) doesn't invalidate the
> i-cache before enabling it. Looking into how Linux enables the I-cache
> in the decompressor for v7[1] revealed that the caches are not cleaned
> there either. (So my plan to copy from Linux failed :-)
> 
> Now I wonder if that is only an unlikely (or even theoretical) issue
> that wasn't noticed up to now or if I'm missing something.

> So stale entries in the cache might even hurt before the cache is
> enabled?! This would mean that you want to invalidate/flush the cache at
> disable-time. Still I think doing it before enabling it in Linux would
> be a good idea. And if it's only because bootloaders and (maybe worse)
> boot roms cannot be trusted in this area.

Are you sure that stale I-cache is causing the issue ?, but
D-cache should not have stale data - it is a pre-requisite for
booting the Kernel [1,2] (though not in Booting documentation)

We were troubled by this issue when Kernel was loaded directly w/o
bootloader, since D-cache was not invalidated, upon enabling,
due to stale data Kernel was crashing randomly

Did want to patch it for long time, but then with cash invalidation
and all it never happened ;)

Regards
Afzal

[1] http://www.arm.linux.org.uk/developer/booting.php
[2] http://comments.gmane.org/gmane.linux.ports.arm.kernel/77718

---------------------------------8<--------------------------------------

Comments

Uwe Kleine-König Dec. 7, 2014, 10:01 a.m. UTC | #1
Hello,

On Sun, Dec 07, 2014 at 02:30:31AM +0530, Afzal Mohammed wrote:
> On Thu, Dec 04, 2014 at 11:03:57AM +0100, Uwe Kleine-König wrote:
> 
> > The breakage I'm currently seeing in barebox might well be explained by
> > stale I-cache entries and barebox (as of now) doesn't invalidate the
> > i-cache before enabling it. Looking into how Linux enables the I-cache
> > in the decompressor for v7[1] revealed that the caches are not cleaned
> > there either. (So my plan to copy from Linux failed :-)
> > 
> > Now I wonder if that is only an unlikely (or even theoretical) issue
> > that wasn't noticed up to now or if I'm missing something.
> 
> > So stale entries in the cache might even hurt before the cache is
> > enabled?! This would mean that you want to invalidate/flush the cache at
> > disable-time. Still I think doing it before enabling it in Linux would
> > be a good idea. And if it's only because bootloaders and (maybe worse)
> > boot roms cannot be trusted in this area.
> 
> Are you sure that stale I-cache is causing the issue ?, but
No, at least it's not the only problem. I found another cache related
problem, similar to yours. Still I think it's the right thing to do (at
least in the bootloader) to invalidate the cache before enabling.

> D-cache should not have stale data - it is a pre-requisite for
> booting the Kernel [1,2] (though not in Booting documentation)
> 
> We were troubled by this issue when Kernel was loaded directly w/o
> bootloader, since D-cache was not invalidated, upon enabling,
> due to stale data Kernel was crashing randomly
> 
> Did want to patch it for long time, but then with cash invalidation
What is cash invalidation? Do you rupture bank notes? (SCNR)

> and all it never happened ;)
> 
> Regards
> Afzal
> 
> [1] http://www.arm.linux.org.uk/developer/booting.php
> [2] http://comments.gmane.org/gmane.linux.ports.arm.kernel/77718
> 
> ---------------------------------8<--------------------------------------
> diff --git a/Documentation/arm/Booting b/Documentation/arm/Booting
> index 371814a36719..c4c423164d5e 100644
> --- a/Documentation/arm/Booting
> +++ b/Documentation/arm/Booting
> @@ -193,6 +193,7 @@ In any case, the following conditions must be met:
>    The MMU must be off.
>    Instruction cache may be on or off.
>    Data cache must be off.
> +  Data cache should be invalidated.
That would match http://www.arm.linux.org.uk/developer/booting.php which
is good. I'd prefer if Linux handled stale data in the cache though for
two reasons:
 - Usually bootloaders are not perfect in adhering to requirements that
   are not obvious from testing.
 - Stale caches introduce problems that are hard to debug.

Best regards
Uwe
Andrew Lunn Dec. 7, 2014, 10:58 a.m. UTC | #2
> That would match http://www.arm.linux.org.uk/developer/booting.php which
> is good. I'd prefer if Linux handled stale data in the cache though for
> two reasons:
>  - Usually bootloaders are not perfect in adhering to requirements that
>    are not obvious from testing.
>  - Stale caches introduce problems that are hard to debug.

Can it handle stale data?

We had an issue with kirkwood boards with u-boot leaving the cache on
when jumping into Linux. This causes corruption in the decompressed
image as it was being decompressed. So by the time the kernel was
running, it was too late, bad things had already happened and death
was coming soon.

There is clearly a difference between leaving caches on, and turning
them off but not cleaning them. So maybe it is possible for the kernel
to handle this?

   Andrew
Uwe Kleine-König Dec. 7, 2014, 11:11 a.m. UTC | #3
On Sun, Dec 07, 2014 at 11:58:23AM +0100, Andrew Lunn wrote:
> > That would match http://www.arm.linux.org.uk/developer/booting.php which
> > is good. I'd prefer if Linux handled stale data in the cache though for
> > two reasons:
> >  - Usually bootloaders are not perfect in adhering to requirements that
> >    are not obvious from testing.
> >  - Stale caches introduce problems that are hard to debug.
> 
> Can it handle stale data?
Yes, it can provided the cache is off adn the kernel image is correctly
flushed to RAM.

BTW that is the state that at least some Tegra machines come out of
reset with. That's why barebox invalidates the data cache before
enabling it.

> We had an issue with kirkwood boards with u-boot leaving the cache on
> when jumping into Linux. This causes corruption in the decompressed
> image as it was being decompressed. So by the time the kernel was
> running, it was too late, bad things had already happened and death
> was coming soon.
> 
> There is clearly a difference between leaving caches on, and turning
> them off but not cleaning them. So maybe it is possible for the kernel
> to handle this?
Even handling the cache being on would be possible, that would result in
something like:

	if cache_is_on():
		# assume everything is fine, no stale entries
	else:
		invalidate_cache()
		enable_cache()

not sure we want that complexity though. Still I think going from

	enable_cache()

as it is now to

	invalidate_cache()
	enable_cache()

would be nice. This would for example handle the case that the
bootloader on Tegra and machines with similar requirements doesn't make
use of the cache at all.

Best regards
Uwe
afzal mohammed Dec. 7, 2014, 11:26 a.m. UTC | #4
Hi,

On Sun, Dec 07, 2014 at 11:01:24AM +0100, Uwe Kleine-König wrote:

> > Did want to patch it for long time, but then with cash invalidation
> What is cash invalidation? Do you rupture bank notes? (SCNR)

cash = contract

Regards
Afzal
Andrew Lunn Dec. 7, 2014, 11:27 a.m. UTC | #5
> Even handling the cache being on would be possible, that would result in
> something like:

We came to the conclusion it is not possible to handle it. Something
to do with the MMU i think, but i don't remember. There is a thread in
the archive, and for the platform which i know is broken, the dts file
points to:

https://lists.debian.org/debian-arm/2012/08/msg00128.html

which contains instructions how to append a few bytes of code to the
front of the image to turn the cache off.


	Andrew
Uwe Kleine-König Dec. 7, 2014, 11:38 a.m. UTC | #6
Hey Andrew,

On Sun, Dec 07, 2014 at 12:27:19PM +0100, Andrew Lunn wrote:
> > Even handling the cache being on would be possible, that would result in
> > something like:
> 
> We came to the conclusion it is not possible to handle it. Something
> to do with the MMU i think, but i don't remember. There is a thread in
> the archive, and for the platform which i know is broken, the dts file
> points to:
> 
> https://lists.debian.org/debian-arm/2012/08/msg00128.html
> 
> which contains instructions how to append a few bytes of code to the
> front of the image to turn the cache off.
I remember the problem. It's about L2 caches here which are not used by
the decompressor.

It's similar to the L1 caches that are in use: If the bootloader keeps
this cache on, the decompressor must be aware of it (and be able to
determine if the cache is on to decide if it needs to flush or
invalidate).

Best regards
Uwe
diff mbox

Patch

diff --git a/Documentation/arm/Booting b/Documentation/arm/Booting
index 371814a36719..c4c423164d5e 100644
--- a/Documentation/arm/Booting
+++ b/Documentation/arm/Booting
@@ -193,6 +193,7 @@  In any case, the following conditions must be met:
   The MMU must be off.
   Instruction cache may be on or off.
   Data cache must be off.
+  Data cache should be invalidated.
 
   If the kernel is entered in HYP mode, the above requirements apply to
   the HYP mode configuration in addition to the ordinary PL1 (privileged