diff mbox

[v10,2/8] ARM: l2c: Refactor the driver to use commit-like interface

Message ID 1419331716-8972-3-git-send-email-m.szyprowski@samsung.com (mailing list archive)
State New, archived
Headers show

Commit Message

Marek Szyprowski Dec. 23, 2014, 10:48 a.m. UTC
From: Tomasz Figa <t.figa@samsung.com>

Certain implementations of secure hypervisors (namely the one found on
Samsung Exynos-based boards) do not provide access to individual L2C
registers. This makes the .write_sec()-based interface insufficient and
provoking ugly hacks.

This patch is first step to make the driver not rely on availability of
writes to individual registers. This is achieved by refactoring the
driver to use a commit-like operation scheme: all register values are
prepared first and stored in an instance of l2x0_regs struct and then a
single callback is responsible to flush those values to the hardware.

Signed-off-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 arch/arm/mm/cache-l2x0.c | 210 ++++++++++++++++++++++++++---------------------
 1 file changed, 115 insertions(+), 95 deletions(-)

Comments

Tony Lindgren Dec. 23, 2014, 5:06 p.m. UTC | #1
* Marek Szyprowski <m.szyprowski@samsung.com> [141223 02:51]:
> From: Tomasz Figa <t.figa@samsung.com>
> 
> Certain implementations of secure hypervisors (namely the one found on
> Samsung Exynos-based boards) do not provide access to individual L2C
> registers. This makes the .write_sec()-based interface insufficient and
> provoking ugly hacks.
> 
> This patch is first step to make the driver not rely on availability of
> writes to individual registers. This is achieved by refactoring the
> driver to use a commit-like operation scheme: all register values are
> prepared first and stored in an instance of l2x0_regs struct and then a
> single callback is responsible to flush those values to the hardware.

The first patch of the series applied things boot with no problem.
But after applying this one I get the following on am437x:

Unhandled fault: imprecise external abort (0xc06) at 0xb6f33884

Probably the same issue Nishanth mentioned.

Regards,

Tony
Nishanth Menon Dec. 23, 2014, 5:13 p.m. UTC | #2
On 12/23/2014 11:06 AM, Tony Lindgren wrote:
> * Marek Szyprowski <m.szyprowski@samsung.com> [141223 02:51]:
>> From: Tomasz Figa <t.figa@samsung.com>
>>
>> Certain implementations of secure hypervisors (namely the one found on
>> Samsung Exynos-based boards) do not provide access to individual L2C
>> registers. This makes the .write_sec()-based interface insufficient and
>> provoking ugly hacks.
>>
>> This patch is first step to make the driver not rely on availability of
>> writes to individual registers. This is achieved by refactoring the
>> driver to use a commit-like operation scheme: all register values are
>> prepared first and stored in an instance of l2x0_regs struct and then a
>> single callback is responsible to flush those values to the hardware.
> 
> The first patch of the series applied things boot with no problem.
> But after applying this one I get the following on am437x:
> 
> Unhandled fault: imprecise external abort (0xc06) at 0xb6f33884
> 
> Probably the same issue Nishanth mentioned.
> 

yep - just finished the bisect... came to the same conclusion..

c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f is the first bad commit
commit c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f
Author: Tomasz Figa <t.figa@samsung.com>
Date:   Tue Dec 23 11:48:30 2014 +0100

    ARM: l2c: Refactor the driver to use commit-like interface

    Certain implementations of secure hypervisors (namely the one found on
    Samsung Exynos-based boards) do not provide access to individual L2C
    registers. This makes the .write_sec()-based interface
insufficient and
    provoking ugly hacks.

    This patch is first step to make the driver not rely on
availability of
    writes to individual registers. This is achieved by refactoring the
    driver to use a commit-like operation scheme: all register values are
    prepared first and stored in an instance of l2x0_regs struct and
then a
    single callback is responsible to flush those values to the hardware.

    Signed-off-by: Tomasz Figa <t.figa@samsung.com>
    Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>

:040000 040000 74c6c74a0dc0612d124cd759951adf2a1e4124ee
8082aabb474f8659231de744d87cd8dbd6dd79bb M	arch


$ git bisect log
git bisect start
# good: [97bf6af1f928216fd6c5a66e8a57bfa95a659672] Linux 3.19-rc1
git bisect good 97bf6af1f928216fd6c5a66e8a57bfa95a659672
# bad: [9afe195db6558621bd8bac379ed65ef121930684] ARM: dts: exynos4:
Add nodes for L2 cache controller
git bisect bad 9afe195db6558621bd8bac379ed65ef121930684
# bad: [0a89ef4dd870bbf692e30fef6c8182d7b8b42e17] ARM: l2c: Get outer
cache .write_sec callback from mach_desc only if not NULL
git bisect bad 0a89ef4dd870bbf692e30fef6c8182d7b8b42e17
# bad: [c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f] ARM: l2c: Refactor
the driver to use commit-like interface
git bisect bad c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f
# good: [080ab387c653b8655dc1ee790658b618399db2aa] ARM: OMAP2+: use
common l2cache initialization code
git bisect good 080ab387c653b8655dc1ee790658b618399db2aa
Tomasz Figa Dec. 28, 2014, 11:34 a.m. UTC | #3
Nishanth, Tony,

On 24.12.2014 02:13, Nishanth Menon wrote:
> On 12/23/2014 11:06 AM, Tony Lindgren wrote:
>> * Marek Szyprowski <m.szyprowski@samsung.com> [141223 02:51]:
>>> From: Tomasz Figa <t.figa@samsung.com>
>>>
>>> Certain implementations of secure hypervisors (namely the one found on
>>> Samsung Exynos-based boards) do not provide access to individual L2C
>>> registers. This makes the .write_sec()-based interface insufficient and
>>> provoking ugly hacks.
>>>
>>> This patch is first step to make the driver not rely on availability of
>>> writes to individual registers. This is achieved by refactoring the
>>> driver to use a commit-like operation scheme: all register values are
>>> prepared first and stored in an instance of l2x0_regs struct and then a
>>> single callback is responsible to flush those values to the hardware.
>>
>> The first patch of the series applied things boot with no problem.
>> But after applying this one I get the following on am437x:
>>
>> Unhandled fault: imprecise external abort (0xc06) at 0xb6f33884
>>
>> Probably the same issue Nishanth mentioned.
>>
>
> yep - just finished the bisect... came to the same conclusion..
>
> c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f is the first bad commit
> commit c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f
> Author: Tomasz Figa <t.figa@samsung.com>
> Date:   Tue Dec 23 11:48:30 2014 +0100
>
>      ARM: l2c: Refactor the driver to use commit-like interface
>
>      Certain implementations of secure hypervisors (namely the one found on
>      Samsung Exynos-based boards) do not provide access to individual L2C
>      registers. This makes the .write_sec()-based interface
> insufficient and
>      provoking ugly hacks.
>
>      This patch is first step to make the driver not rely on
> availability of
>      writes to individual registers. This is achieved by refactoring the
>      driver to use a commit-like operation scheme: all register values are
>      prepared first and stored in an instance of l2x0_regs struct and
> then a
>      single callback is responsible to flush those values to the hardware.
>
>      Signed-off-by: Tomasz Figa <t.figa@samsung.com>
>      Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
>
> :040000 040000 74c6c74a0dc0612d124cd759951adf2a1e4124ee
> 8082aabb474f8659231de744d87cd8dbd6dd79bb M	arch
>
>
> $ git bisect log
> git bisect start
> # good: [97bf6af1f928216fd6c5a66e8a57bfa95a659672] Linux 3.19-rc1
> git bisect good 97bf6af1f928216fd6c5a66e8a57bfa95a659672
> # bad: [9afe195db6558621bd8bac379ed65ef121930684] ARM: dts: exynos4:
> Add nodes for L2 cache controller
> git bisect bad 9afe195db6558621bd8bac379ed65ef121930684
> # bad: [0a89ef4dd870bbf692e30fef6c8182d7b8b42e17] ARM: l2c: Get outer
> cache .write_sec callback from mach_desc only if not NULL
> git bisect bad 0a89ef4dd870bbf692e30fef6c8182d7b8b42e17
> # bad: [c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f] ARM: l2c: Refactor
> the driver to use commit-like interface
> git bisect bad c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f
> # good: [080ab387c653b8655dc1ee790658b618399db2aa] ARM: OMAP2+: use
> common l2cache initialization code
> git bisect good 080ab387c653b8655dc1ee790658b618399db2aa
>
>

May I ask you (or anyone else working on OMAP) to try to figure out what 
the issue is? It is stopping L2 cache support for Exynos4 being merged 
and Exynos people don't have access to any of affected boards to do 
anything about it. After all, this is generic code, so I believe 
community should cooperate with pushing it forward. (Of course I 
understand it is a holiday season at the moment, so I don't expect any 
solution right at this moment :))

Apparently patch 1/8 solved problems with some of the boards. Could you 
check how those boards differ and look for potential causes?

Best regards,
Tomasz
Nishanth Menon Dec. 29, 2014, 2:29 p.m. UTC | #4
On 20:34-20141228, Tomasz Figa wrote:
> May I ask you (or anyone else working on OMAP) to try to figure out
> what the issue is? It is stopping L2 cache support for Exynos4 being

http://slexy.org/view/s2BnzxVglj
Took a register dump and compared -> Got the same dump
http://slexy.org/view/s21YRHpzeW .

Basically, this implies: possible sequence change broke AM437x or
some secure function call (which i have'nt dumped).

Is it possible to break up this patch into easier patches?
Nishanth Menon Dec. 29, 2014, 6:23 p.m. UTC | #5
On 12/23/2014 04:48 AM, Marek Szyprowski wrote:

> -static void l2c310_resume(void)
> +static void l2c310_configure(void __iomem *base)
>  {
> -	void __iomem *base = l2x0_base;
> +	unsigned revision;
>  
> -	if (!(readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN)) {
> -		unsigned revision;
> -
> -		/* restore pl310 setup */
> -		writel_relaxed(l2x0_saved_regs.tag_latency,
> -			       base + L310_TAG_LATENCY_CTRL);
> -		writel_relaxed(l2x0_saved_regs.data_latency,
> -			       base + L310_DATA_LATENCY_CTRL);
> -		writel_relaxed(l2x0_saved_regs.filter_end,
> -			       base + L310_ADDR_FILTER_END);
> -		writel_relaxed(l2x0_saved_regs.filter_start,
> -			       base + L310_ADDR_FILTER_START);
> -
> -		revision = readl_relaxed(base + L2X0_CACHE_ID) &
> -				L2X0_CACHE_ID_RTL_MASK;
> -
> -		if (revision >= L310_CACHE_ID_RTL_R2P0)
> -			l2c_write_sec(l2x0_saved_regs.prefetch_ctrl, base,
> -				      L310_PREFETCH_CTRL);
> -		if (revision >= L310_CACHE_ID_RTL_R3P0)
> -			l2c_write_sec(l2x0_saved_regs.pwr_ctrl, base,
> -				      L310_POWER_CTRL);
> -
> -		l2c_enable(base, l2x0_saved_regs.aux_ctrl, 8);
> -
> -		/* Re-enable full-line-of-zeros for Cortex-A9 */
> -		if (l2x0_saved_regs.aux_ctrl & L310_AUX_CTRL_FULL_LINE_ZERO)
> -			set_auxcr(get_auxcr() | BIT(3) | BIT(2) | BIT(1));
> -	}
> +	/* restore pl310 setup */
> +	writel_relaxed(l2x0_saved_regs.tag_latency,
> +		       base + L310_TAG_LATENCY_CTRL);
> +	writel_relaxed(l2x0_saved_regs.data_latency,
> +		       base + L310_DATA_LATENCY_CTRL);
> +	writel_relaxed(l2x0_saved_regs.filter_end,
> +		       base + L310_ADDR_FILTER_END);
> +	writel_relaxed(l2x0_saved_regs.filter_start,
> +		       base + L310_ADDR_FILTER_START);
> +

^^ The above change broke AM437xx. Looks like the change causes the
following behavior difference on AM437x. For some reason, touching any
of the above 4 registers(even with the values read from the same
registers) causes AM437x to go beserk. Comment the 4 writes and we
reach shell. looks like l2c310_resume is not invoked prior to this
series. :(.. now that we reuse that logic to actually do programming,
we start to see the problem.

one option might be to write only those registers that differ from
saved_registers (example: unmodified values dont need reprogramming).

Looks like the following also need addressing:
data->save is called twice (once more after l2cof_init)
l2c310_init_fns also needs l2c310_configure
will be nice to use l2x0_data only after we kmemdup data in __l2c_init

if you'd like to split this up in pieces, [1] might be nice - will go
good to change the pl310, aurora etc in each chunks to enable better
review.

[1]
https://github.com/nmenon/linux-2.6-playground/commits/temp/l2c-patch2-splitup
Tomasz Figa Dec. 30, 2014, 9:05 a.m. UTC | #6
Thanks a lot for investigating this, even before I could look into
splitting this.

2014-12-30 3:23 GMT+09:00 Nishanth Menon <nm@ti.com>:
> On 12/23/2014 04:48 AM, Marek Szyprowski wrote:
>
>> -static void l2c310_resume(void)
>> +static void l2c310_configure(void __iomem *base)
>>  {
>> -     void __iomem *base = l2x0_base;
>> +     unsigned revision;
>>
>> -     if (!(readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN)) {
>> -             unsigned revision;
>> -
>> -             /* restore pl310 setup */
>> -             writel_relaxed(l2x0_saved_regs.tag_latency,
>> -                            base + L310_TAG_LATENCY_CTRL);
>> -             writel_relaxed(l2x0_saved_regs.data_latency,
>> -                            base + L310_DATA_LATENCY_CTRL);
>> -             writel_relaxed(l2x0_saved_regs.filter_end,
>> -                            base + L310_ADDR_FILTER_END);
>> -             writel_relaxed(l2x0_saved_regs.filter_start,
>> -                            base + L310_ADDR_FILTER_START);
>> -
>> -             revision = readl_relaxed(base + L2X0_CACHE_ID) &
>> -                             L2X0_CACHE_ID_RTL_MASK;
>> -
>> -             if (revision >= L310_CACHE_ID_RTL_R2P0)
>> -                     l2c_write_sec(l2x0_saved_regs.prefetch_ctrl, base,
>> -                                   L310_PREFETCH_CTRL);
>> -             if (revision >= L310_CACHE_ID_RTL_R3P0)
>> -                     l2c_write_sec(l2x0_saved_regs.pwr_ctrl, base,
>> -                                   L310_POWER_CTRL);
>> -
>> -             l2c_enable(base, l2x0_saved_regs.aux_ctrl, 8);
>> -
>> -             /* Re-enable full-line-of-zeros for Cortex-A9 */
>> -             if (l2x0_saved_regs.aux_ctrl & L310_AUX_CTRL_FULL_LINE_ZERO)
>> -                     set_auxcr(get_auxcr() | BIT(3) | BIT(2) | BIT(1));
>> -     }
>> +     /* restore pl310 setup */
>> +     writel_relaxed(l2x0_saved_regs.tag_latency,
>> +                    base + L310_TAG_LATENCY_CTRL);
>> +     writel_relaxed(l2x0_saved_regs.data_latency,
>> +                    base + L310_DATA_LATENCY_CTRL);
>> +     writel_relaxed(l2x0_saved_regs.filter_end,
>> +                    base + L310_ADDR_FILTER_END);
>> +     writel_relaxed(l2x0_saved_regs.filter_start,
>> +                    base + L310_ADDR_FILTER_START);
>> +
>
> ^^ The above change broke AM437xx. Looks like the change causes the
> following behavior difference on AM437x. For some reason, touching any
> of the above 4 registers(even with the values read from the same
> registers) causes AM437x to go beserk. Comment the 4 writes and we
> reach shell. looks like l2c310_resume is not invoked prior to this
> series. :(.. now that we reuse that logic to actually do programming,
> we start to see the problem.

Hmm, but the thing is that .configure() should not be called if the
controller is already configured, i.e. L2X0_CTRL_EN in L2X0_CTRL is
set. Maybe I missed some check somewhere. Let me reread my code I
wrote quite a long time ago and make sure.

>
> one option might be to write only those registers that differ from
> saved_registers (example: unmodified values dont need reprogramming).
>
> Looks like the following also need addressing:
> data->save is called twice (once more after l2cof_init)
> l2c310_init_fns also needs l2c310_configure
> will be nice to use l2x0_data only after we kmemdup data in __l2c_init

I'll check this.

>
> if you'd like to split this up in pieces, [1] might be nice - will go
> good to change the pl310, aurora etc in each chunks to enable better
> review.

Thanks a lot, the split up version will be definitely useful. Just to
make sure, the parts look quite bisectable, but have you verified that
applying the changes one by one leave the L2 cache working on OMAP?

>
> [1]
> https://github.com/nmenon/linux-2.6-playground/commits/temp/l2c-patch2-splitup

Best regards,
Tomasz
Nishanth Menon Dec. 30, 2014, 2:51 p.m. UTC | #7
On 12/30/2014 03:05 AM, Tomasz Figa wrote:
> Thanks a lot for investigating this, even before I could look into
> splitting this.
> 
> 2014-12-30 3:23 GMT+09:00 Nishanth Menon <nm@ti.com>:
>> On 12/23/2014 04:48 AM, Marek Szyprowski wrote:
>>
>>> -static void l2c310_resume(void)
>>> +static void l2c310_configure(void __iomem *base)
>>>  {
>>> -     void __iomem *base = l2x0_base;
>>> +     unsigned revision;
>>>
>>> -     if (!(readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN)) {
>>> -             unsigned revision;
>>> -
>>> -             /* restore pl310 setup */
>>> -             writel_relaxed(l2x0_saved_regs.tag_latency,
>>> -                            base + L310_TAG_LATENCY_CTRL);
>>> -             writel_relaxed(l2x0_saved_regs.data_latency,
>>> -                            base + L310_DATA_LATENCY_CTRL);
>>> -             writel_relaxed(l2x0_saved_regs.filter_end,
>>> -                            base + L310_ADDR_FILTER_END);
>>> -             writel_relaxed(l2x0_saved_regs.filter_start,
>>> -                            base + L310_ADDR_FILTER_START);
>>> -
>>> -             revision = readl_relaxed(base + L2X0_CACHE_ID) &
>>> -                             L2X0_CACHE_ID_RTL_MASK;
>>> -
>>> -             if (revision >= L310_CACHE_ID_RTL_R2P0)
>>> -                     l2c_write_sec(l2x0_saved_regs.prefetch_ctrl, base,
>>> -                                   L310_PREFETCH_CTRL);
>>> -             if (revision >= L310_CACHE_ID_RTL_R3P0)
>>> -                     l2c_write_sec(l2x0_saved_regs.pwr_ctrl, base,
>>> -                                   L310_POWER_CTRL);
>>> -
>>> -             l2c_enable(base, l2x0_saved_regs.aux_ctrl, 8);
>>> -
>>> -             /* Re-enable full-line-of-zeros for Cortex-A9 */
>>> -             if (l2x0_saved_regs.aux_ctrl & L310_AUX_CTRL_FULL_LINE_ZERO)
>>> -                     set_auxcr(get_auxcr() | BIT(3) | BIT(2) | BIT(1));
>>> -     }
>>> +     /* restore pl310 setup */
>>> +     writel_relaxed(l2x0_saved_regs.tag_latency,
>>> +                    base + L310_TAG_LATENCY_CTRL);
>>> +     writel_relaxed(l2x0_saved_regs.data_latency,
>>> +                    base + L310_DATA_LATENCY_CTRL);
>>> +     writel_relaxed(l2x0_saved_regs.filter_end,
>>> +                    base + L310_ADDR_FILTER_END);
>>> +     writel_relaxed(l2x0_saved_regs.filter_start,
>>> +                    base + L310_ADDR_FILTER_START);
>>> +
>>
>> ^^ The above change broke AM437xx. Looks like the change causes the
>> following behavior difference on AM437x. For some reason, touching any
>> of the above 4 registers(even with the values read from the same
>> registers) causes AM437x to go beserk. Comment the 4 writes and we
>> reach shell. looks like l2c310_resume is not invoked prior to this
>> series. :(.. now that we reuse that logic to actually do programming,
>> we start to see the problem.
> 
> Hmm, but the thing is that .configure() should not be called if the
> controller is already configured, i.e. L2X0_CTRL_EN in L2X0_CTRL is
> set. Maybe I missed some check somewhere. Let me reread my code I
> wrote quite a long time ago and make sure.

you have'nt missed a check here. it does indeed get called
l2c310_enable->l2c_enable -> (if cache
disabled)->l2c_configure->l2c310_configure

It looks like a quirk of AM437x which remained hidden till this patch
exposed it. The original pl310 resume would have been invoked if
outer_resume was invoked (if my reading of code is correct), which in
the case of AM437x was never invoked during boot. By reusing the
restore code for resume (which in my opinion is a good change), we
seemed to have exposed quirky behavior on am437x. I started a thread
yesterday with hardware folks trying to understand the integration
aspects and explanation for this quirky behavior - unfortunately, with
holidays, I doubt I might get a quick answer fast. but the workaround
seems obvious - do not write to the the mentioned 4 registers on
am437x. This might make sense it arm,tag-latency , arm,data-latency,
arm,filter-ranges properties are not defined in OF.

>> one option might be to write only those registers that differ from
>> saved_registers (example: unmodified values dont need reprogramming).
>>
>> Looks like the following also need addressing:
>> data->save is called twice (once more after l2cof_init)
>> l2c310_init_fns also needs l2c310_configure
>> will be nice to use l2x0_data only after we kmemdup data in __l2c_init
> 
> I'll check this.
Thanks.

> 
>>
>> if you'd like to split this up in pieces, [1] might be nice - will go
>> good to change the pl310, aurora etc in each chunks to enable better
>> review.
> 
> Thanks a lot, the split up version will be definitely useful. Just to
> make sure, the parts look quite bisectable, but have you verified that
> applying the changes one by one leave the L2 cache working on OMAP?

The split is not complete or properly done as I ignored aurora and was
trying to do it focussed on pl310 impacted code in a hurry, but yes,
the split is indeed bisectable for OMAP platforms. The intent was to
indicate the direction we should probably take and introduce the
refactor in stages. At the very least, it tends to be easier to debug
down to.

>> [1]
>> https://github.com/nmenon/linux-2.6-playground/commits/temp/l2c-patch2-splitup
> 
> Best regards,
> Tomasz
>
Tomasz Figa Jan. 2, 2015, 8:55 a.m. UTC | #8
On 30.12.2014 03:23, Nishanth Menon wrote:
> On 12/23/2014 04:48 AM, Marek Szyprowski wrote:
>
>> -static void l2c310_resume(void)
>> +static void l2c310_configure(void __iomem *base)
>>   {
>> -	void __iomem *base = l2x0_base;
>> +	unsigned revision;
>>
>> -	if (!(readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN)) {
>> -		unsigned revision;
>> -
>> -		/* restore pl310 setup */
>> -		writel_relaxed(l2x0_saved_regs.tag_latency,
>> -			       base + L310_TAG_LATENCY_CTRL);
>> -		writel_relaxed(l2x0_saved_regs.data_latency,
>> -			       base + L310_DATA_LATENCY_CTRL);
>> -		writel_relaxed(l2x0_saved_regs.filter_end,
>> -			       base + L310_ADDR_FILTER_END);
>> -		writel_relaxed(l2x0_saved_regs.filter_start,
>> -			       base + L310_ADDR_FILTER_START);
>> -
>> -		revision = readl_relaxed(base + L2X0_CACHE_ID) &
>> -				L2X0_CACHE_ID_RTL_MASK;
>> -
>> -		if (revision >= L310_CACHE_ID_RTL_R2P0)
>> -			l2c_write_sec(l2x0_saved_regs.prefetch_ctrl, base,
>> -				      L310_PREFETCH_CTRL);
>> -		if (revision >= L310_CACHE_ID_RTL_R3P0)
>> -			l2c_write_sec(l2x0_saved_regs.pwr_ctrl, base,
>> -				      L310_POWER_CTRL);
>> -
>> -		l2c_enable(base, l2x0_saved_regs.aux_ctrl, 8);
>> -
>> -		/* Re-enable full-line-of-zeros for Cortex-A9 */
>> -		if (l2x0_saved_regs.aux_ctrl & L310_AUX_CTRL_FULL_LINE_ZERO)
>> -			set_auxcr(get_auxcr() | BIT(3) | BIT(2) | BIT(1));
>> -	}
>> +	/* restore pl310 setup */
>> +	writel_relaxed(l2x0_saved_regs.tag_latency,
>> +		       base + L310_TAG_LATENCY_CTRL);
>> +	writel_relaxed(l2x0_saved_regs.data_latency,
>> +		       base + L310_DATA_LATENCY_CTRL);
>> +	writel_relaxed(l2x0_saved_regs.filter_end,
>> +		       base + L310_ADDR_FILTER_END);
>> +	writel_relaxed(l2x0_saved_regs.filter_start,
>> +		       base + L310_ADDR_FILTER_START);
>> +
>
> ^^ The above change broke AM437xx. Looks like the change causes the
> following behavior difference on AM437x. For some reason, touching any
> of the above 4 registers(even with the values read from the same
> registers) causes AM437x to go beserk. Comment the 4 writes and we
> reach shell. looks like l2c310_resume is not invoked prior to this
> series. :(.. now that we reuse that logic to actually do programming,
> we start to see the problem.

OK, I probably have answer for this. Apparently all four register above 
cannot be written in non-secure mode and they should go through 
l2c_write_sec(). More on this can be found in CoreLink Level 2 Cache 
Controller L2C-310 Technical Reference Manual, 3.2. Register summary, 
table 3.1. I have checked the TRM for r3p3, but I guess this should be 
uniform for all revisions.

Why this worked before? The registers were not written unless respective 
properties in DT were present and OMAP do not have them in DT. Current 
code always writes them, which should not really matter if the code is 
correct. (But it isn't - writel_relaxed() can't be used directly for 
those registers.)

Could you check if replacing those four writel_relaxed() with 
l2c_write_sec() does the thing?

Best regards,
Tomasz
Tomasz Figa Jan. 2, 2015, 9:13 a.m. UTC | #9
On 30.12.2014 23:51, Nishanth Menon wrote:
>>> Looks like the following also need addressing:
>>> data->save is called twice (once more after l2cof_init)
>>> l2c310_init_fns also needs l2c310_configure
>>> will be nice to use l2x0_data only after we kmemdup data in __l2c_init
>>
>> I'll check this.
> Thanks.
>

Apparently the second save in __l2c_init() is not needed and it should 
have been removed. However it might be a good idea to actually do second 
save in l2c_enable() after l2c_configure() so that the values actually 
permitted by hardware and/or secure firmware are stored.

l2c310_init_fns needs to be updated indeed.

However I'm not sure about your concern about using l2x0_data before 
kmemdup(). I don't see any code potentially doing this.

Best regards,
Tomasz
Tomasz Figa Jan. 2, 2015, 9:28 a.m. UTC | #10
On 02.01.2015 18:13, Tomasz Figa wrote:
> On 30.12.2014 23:51, Nishanth Menon wrote:
>>>> Looks like the following also need addressing:
>>>> data->save is called twice (once more after l2cof_init)
>>>> l2c310_init_fns also needs l2c310_configure
>>>> will be nice to use l2x0_data only after we kmemdup data in __l2c_init
>>>
>>> I'll check this.
>> Thanks.
>>
>
> Apparently the second save in __l2c_init() is not needed and it should
> have been removed. However it might be a good idea to actually do second
> save in l2c_enable() after l2c_configure() so that the values actually
> permitted by hardware and/or secure firmware are stored.
>
> l2c310_init_fns needs to be updated indeed.

Hmm, apparently current patch already adds this (and I missed it reading 
it at first), so I'm not sure what's your concern about it.

Best regards,
Tomasz
Nishanth Menon Jan. 2, 2015, 3:36 p.m. UTC | #11
On 01/02/2015 03:28 AM, Tomasz Figa wrote:
> 
> 
> On 02.01.2015 18:13, Tomasz Figa wrote:
>> On 30.12.2014 23:51, Nishanth Menon wrote:
>>>>> Looks like the following also need addressing:
>>>>> data->save is called twice (once more after l2cof_init)
>>>>> l2c310_init_fns also needs l2c310_configure
>>>>> will be nice to use l2x0_data only after we kmemdup data in __l2c_init
>>>>
>>>> I'll check this.
>>> Thanks.
>>>
>>
>> Apparently the second save in __l2c_init() is not needed and it should
>> have been removed. However it might be a good idea to actually do second
>> save in l2c_enable() after l2c_configure() so that the values actually
>> permitted by hardware and/or secure firmware are stored.
>>
>> l2c310_init_fns needs to be updated indeed.
> 
> Hmm, apparently current patch already adds this (and I missed it reading 
> it at first), so I'm not sure what's your concern about it.

Uggh.. looks like I missed the same as well :( Sorry about that..
Nishanth Menon Jan. 2, 2015, 3:38 p.m. UTC | #12
On 01/02/2015 03:13 AM, Tomasz Figa wrote:

> However I'm not sure about your concern about using l2x0_data before 
> kmemdup(). I don't see any code potentially doing this.

It is not terribly important, but anyways [1] is what I had in mind..



[1]
https://github.com/nmenon/linux-2.6-playground/commit/40f65e4707856f7b35872cf2225bf8beaf43552c#diff-56925866b8b499d75763bf5a1d7f1666L883
Nishanth Menon Jan. 2, 2015, 5:57 p.m. UTC | #13
On 01/02/2015 02:55 AM, Tomasz Figa wrote:
> On 30.12.2014 03:23, Nishanth Menon wrote:
>> On 12/23/2014 04:48 AM, Marek Szyprowski wrote:
>>
>>> -static void l2c310_resume(void)
>>> +static void l2c310_configure(void __iomem *base)
>>>   {
>>> -	void __iomem *base = l2x0_base;
>>> +	unsigned revision;
>>>
>>> -	if (!(readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN)) {
>>> -		unsigned revision;
>>> -
>>> -		/* restore pl310 setup */
>>> -		writel_relaxed(l2x0_saved_regs.tag_latency,
>>> -			       base + L310_TAG_LATENCY_CTRL);
>>> -		writel_relaxed(l2x0_saved_regs.data_latency,
>>> -			       base + L310_DATA_LATENCY_CTRL);
>>> -		writel_relaxed(l2x0_saved_regs.filter_end,
>>> -			       base + L310_ADDR_FILTER_END);
>>> -		writel_relaxed(l2x0_saved_regs.filter_start,
>>> -			       base + L310_ADDR_FILTER_START);
>>> -
>>> -		revision = readl_relaxed(base + L2X0_CACHE_ID) &
>>> -				L2X0_CACHE_ID_RTL_MASK;
>>> -
>>> -		if (revision >= L310_CACHE_ID_RTL_R2P0)
>>> -			l2c_write_sec(l2x0_saved_regs.prefetch_ctrl, base,
>>> -				      L310_PREFETCH_CTRL);
>>> -		if (revision >= L310_CACHE_ID_RTL_R3P0)
>>> -			l2c_write_sec(l2x0_saved_regs.pwr_ctrl, base,
>>> -				      L310_POWER_CTRL);
>>> -
>>> -		l2c_enable(base, l2x0_saved_regs.aux_ctrl, 8);
>>> -
>>> -		/* Re-enable full-line-of-zeros for Cortex-A9 */
>>> -		if (l2x0_saved_regs.aux_ctrl & L310_AUX_CTRL_FULL_LINE_ZERO)
>>> -			set_auxcr(get_auxcr() | BIT(3) | BIT(2) | BIT(1));
>>> -	}
>>> +	/* restore pl310 setup */
>>> +	writel_relaxed(l2x0_saved_regs.tag_latency,
>>> +		       base + L310_TAG_LATENCY_CTRL);
>>> +	writel_relaxed(l2x0_saved_regs.data_latency,
>>> +		       base + L310_DATA_LATENCY_CTRL);
>>> +	writel_relaxed(l2x0_saved_regs.filter_end,
>>> +		       base + L310_ADDR_FILTER_END);
>>> +	writel_relaxed(l2x0_saved_regs.filter_start,
>>> +		       base + L310_ADDR_FILTER_START);
>>> +
>>
>> ^^ The above change broke AM437xx. Looks like the change causes the
>> following behavior difference on AM437x. For some reason, touching any
>> of the above 4 registers(even with the values read from the same
>> registers) causes AM437x to go beserk. Comment the 4 writes and we
>> reach shell. looks like l2c310_resume is not invoked prior to this
>> series. :(.. now that we reuse that logic to actually do programming,
>> we start to see the problem.
> 
> OK, I probably have answer for this. Apparently all four register above 
> cannot be written in non-secure mode and they should go through 
> l2c_write_sec(). More on this can be found in CoreLink Level 2 Cache 
> Controller L2C-310 Technical Reference Manual, 3.2. Register summary, 
> table 3.1. I have checked the TRM for r3p3, but I guess this should be 
> uniform for all revisions.

Yep, you seemed to have caught the issue correctly.

> 
> Why this worked before? The registers were not written unless respective 
> properties in DT were present and OMAP do not have them in DT. Current 
> code always writes them, which should not really matter if the code is 
> correct. (But it isn't - writel_relaxed() can't be used directly for 
> those registers.)
> 
> Could you check if replacing those four writel_relaxed() with 
> l2c_write_sec() does the thing?

Considering that
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arm/mach-omap2/omap4-common.c#n169
has no implementation of DATA,TAG_LATENCY, FILTER_*
I did a few patches for those - separate from this series and posted
(v2 of the series):
https://patchwork.kernel.org/patch/5560231/
https://patchwork.kernel.org/patch/5560211/

Anyways, l2c_write_sec will refuse to write if the read value is same
as requested value - that is exactly what we want here. So, for
testing, I hacked it and remove the check to force the writes
(http://slexy.org/view/s2p8c3gl32)

The replaced raw_writels with secure writes(fix needed for this
patch): http://slexy.org/view/s21PbM73tt

(with secure writes, my patches and force write hack applied):
 1: am437x-sk: BOOT: PASS: http://slexy.org/raw/s2BxqX5NOz
 2: am43xx-epos: BOOT: PASS: http://slexy.org/raw/s2VkKYuYed
 3: am43xx-gpevm: BOOT: PASS: http://slexy.org/raw/s2kO95WSuY
 4: pandaboard-es: BOOT: PASS: http://slexy.org/raw/s21sx5gmUl
 5: pandaboard-vanilla: BOOT: PASS: http://slexy.org/raw/s21lpUJK6r
 6: sdp4430: BOOT: PASS: http://slexy.org/raw/s21UVKHle5


(with just secure writes and none of my patches or hacks applied):
 1: am437x-sk: BOOT: PASS: http://slexy.org/raw/s21UxoDdo5
 2: am43xx-epos: BOOT: PASS: http://slexy.org/raw/s2ECrLZ1FH
 3: am43xx-gpevm: BOOT: PASS: http://slexy.org/raw/s2CowTVA9P
 4: pandaboard-es: BOOT: PASS: http://slexy.org/raw/s205MVIWf0
 5: pandaboard-vanilla: BOOT: PASS: http://slexy.org/raw/s29oJ66Rqs
 6: sdp4430: BOOT: PASS: http://slexy.org/raw/s20jLzW2cX


Awesome :). Thanks for catching this.
diff mbox

Patch

diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
index 5e65ca8dea62..e5948c5adaa7 100644
--- a/arch/arm/mm/cache-l2x0.c
+++ b/arch/arm/mm/cache-l2x0.c
@@ -41,12 +41,14 @@  struct l2c_init_data {
 	void (*enable)(void __iomem *, u32, unsigned);
 	void (*fixup)(void __iomem *, u32, struct outer_cache_fns *);
 	void (*save)(void __iomem *);
+	void (*configure)(void __iomem *);
 	struct outer_cache_fns outer_cache;
 };
 
 #define CACHE_LINE_SIZE		32
 
 static void __iomem *l2x0_base;
+static const struct l2c_init_data *l2x0_data;
 static DEFINE_RAW_SPINLOCK(l2x0_lock);
 static u32 l2x0_way_mask;	/* Bitmask of active ways */
 static u32 l2x0_size;
@@ -106,6 +108,14 @@  static inline void l2c_unlock(void __iomem *base, unsigned num)
 	}
 }
 
+static void l2c_configure(void __iomem *base)
+{
+	if (l2x0_data->configure)
+		l2x0_data->configure(base);
+
+	l2c_write_sec(l2x0_saved_regs.aux_ctrl, base, L2X0_AUX_CTRL);
+}
+
 /*
  * Enable the L2 cache controller.  This function must only be
  * called when the cache controller is known to be disabled.
@@ -114,7 +124,12 @@  static void l2c_enable(void __iomem *base, u32 aux, unsigned num_lock)
 {
 	unsigned long flags;
 
-	l2c_write_sec(aux, base, L2X0_AUX_CTRL);
+	/* Do not touch the controller if already enabled. */
+	if (readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN)
+		return;
+
+	l2x0_saved_regs.aux_ctrl = aux;
+	l2c_configure(base);
 
 	l2c_unlock(base, num_lock);
 
@@ -208,6 +223,11 @@  static void l2c_save(void __iomem *base)
 	l2x0_saved_regs.aux_ctrl = readl_relaxed(l2x0_base + L2X0_AUX_CTRL);
 }
 
+static void l2c_resume(void)
+{
+	l2c_enable(l2x0_base, l2x0_saved_regs.aux_ctrl, l2x0_data->num_lock);
+}
+
 /*
  * L2C-210 specific code.
  *
@@ -288,14 +308,6 @@  static void l2c210_sync(void)
 	__l2c210_cache_sync(l2x0_base);
 }
 
-static void l2c210_resume(void)
-{
-	void __iomem *base = l2x0_base;
-
-	if (!(readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN))
-		l2c_enable(base, l2x0_saved_regs.aux_ctrl, 1);
-}
-
 static const struct l2c_init_data l2c210_data __initconst = {
 	.type = "L2C-210",
 	.way_size_0 = SZ_8K,
@@ -309,7 +321,7 @@  static const struct l2c_init_data l2c210_data __initconst = {
 		.flush_all = l2c210_flush_all,
 		.disable = l2c_disable,
 		.sync = l2c210_sync,
-		.resume = l2c210_resume,
+		.resume = l2c_resume,
 	},
 };
 
@@ -466,7 +478,7 @@  static const struct l2c_init_data l2c220_data = {
 		.flush_all = l2c220_flush_all,
 		.disable = l2c_disable,
 		.sync = l2c220_sync,
-		.resume = l2c210_resume,
+		.resume = l2c_resume,
 	},
 };
 
@@ -615,39 +627,29 @@  static void __init l2c310_save(void __iomem *base)
 							L310_POWER_CTRL);
 }
 
-static void l2c310_resume(void)
+static void l2c310_configure(void __iomem *base)
 {
-	void __iomem *base = l2x0_base;
+	unsigned revision;
 
-	if (!(readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN)) {
-		unsigned revision;
-
-		/* restore pl310 setup */
-		writel_relaxed(l2x0_saved_regs.tag_latency,
-			       base + L310_TAG_LATENCY_CTRL);
-		writel_relaxed(l2x0_saved_regs.data_latency,
-			       base + L310_DATA_LATENCY_CTRL);
-		writel_relaxed(l2x0_saved_regs.filter_end,
-			       base + L310_ADDR_FILTER_END);
-		writel_relaxed(l2x0_saved_regs.filter_start,
-			       base + L310_ADDR_FILTER_START);
-
-		revision = readl_relaxed(base + L2X0_CACHE_ID) &
-				L2X0_CACHE_ID_RTL_MASK;
-
-		if (revision >= L310_CACHE_ID_RTL_R2P0)
-			l2c_write_sec(l2x0_saved_regs.prefetch_ctrl, base,
-				      L310_PREFETCH_CTRL);
-		if (revision >= L310_CACHE_ID_RTL_R3P0)
-			l2c_write_sec(l2x0_saved_regs.pwr_ctrl, base,
-				      L310_POWER_CTRL);
-
-		l2c_enable(base, l2x0_saved_regs.aux_ctrl, 8);
-
-		/* Re-enable full-line-of-zeros for Cortex-A9 */
-		if (l2x0_saved_regs.aux_ctrl & L310_AUX_CTRL_FULL_LINE_ZERO)
-			set_auxcr(get_auxcr() | BIT(3) | BIT(2) | BIT(1));
-	}
+	/* restore pl310 setup */
+	writel_relaxed(l2x0_saved_regs.tag_latency,
+		       base + L310_TAG_LATENCY_CTRL);
+	writel_relaxed(l2x0_saved_regs.data_latency,
+		       base + L310_DATA_LATENCY_CTRL);
+	writel_relaxed(l2x0_saved_regs.filter_end,
+		       base + L310_ADDR_FILTER_END);
+	writel_relaxed(l2x0_saved_regs.filter_start,
+		       base + L310_ADDR_FILTER_START);
+
+	revision = readl_relaxed(base + L2X0_CACHE_ID) &
+				 L2X0_CACHE_ID_RTL_MASK;
+
+	if (revision >= L310_CACHE_ID_RTL_R2P0)
+		l2c_write_sec(l2x0_saved_regs.prefetch_ctrl, base,
+			      L310_PREFETCH_CTRL);
+	if (revision >= L310_CACHE_ID_RTL_R3P0)
+		l2c_write_sec(l2x0_saved_regs.pwr_ctrl, base,
+			      L310_POWER_CTRL);
 }
 
 static int l2c310_cpu_enable_flz(struct notifier_block *nb, unsigned long act, void *data)
@@ -699,6 +701,23 @@  static void __init l2c310_enable(void __iomem *base, u32 aux, unsigned num_lock)
 		aux &= ~(L310_AUX_CTRL_FULL_LINE_ZERO | L310_AUX_CTRL_EARLY_BRESP);
 	}
 
+	/* r3p0 or later has power control register */
+	if (rev >= L310_CACHE_ID_RTL_R3P0)
+		l2x0_saved_regs.pwr_ctrl = L310_DYNAMIC_CLK_GATING_EN |
+						L310_STNDBY_MODE_EN;
+
+	/*
+	 * Always enable non-secure access to the lockdown registers -
+	 * we write to them as part of the L2C enable sequence so they
+	 * need to be accessible.
+	 */
+	aux |= L310_AUX_CTRL_NS_LOCKDOWN;
+
+	l2c_enable(base, aux, num_lock);
+
+	/* Read back resulting AUX_CTRL value as it could have been altered. */
+	aux = readl_relaxed(base + L2X0_AUX_CTRL);
+
 	if (aux & (L310_AUX_CTRL_DATA_PREFETCH | L310_AUX_CTRL_INSTR_PREFETCH)) {
 		u32 prefetch = readl_relaxed(base + L310_PREFETCH_CTRL);
 
@@ -712,23 +731,12 @@  static void __init l2c310_enable(void __iomem *base, u32 aux, unsigned num_lock)
 	if (rev >= L310_CACHE_ID_RTL_R3P0) {
 		u32 power_ctrl;
 
-		l2c_write_sec(L310_DYNAMIC_CLK_GATING_EN | L310_STNDBY_MODE_EN,
-			      base, L310_POWER_CTRL);
 		power_ctrl = readl_relaxed(base + L310_POWER_CTRL);
 		pr_info("L2C-310 dynamic clock gating %sabled, standby mode %sabled\n",
 			power_ctrl & L310_DYNAMIC_CLK_GATING_EN ? "en" : "dis",
 			power_ctrl & L310_STNDBY_MODE_EN ? "en" : "dis");
 	}
 
-	/*
-	 * Always enable non-secure access to the lockdown registers -
-	 * we write to them as part of the L2C enable sequence so they
-	 * need to be accessible.
-	 */
-	aux |= L310_AUX_CTRL_NS_LOCKDOWN;
-
-	l2c_enable(base, aux, num_lock);
-
 	if (aux & L310_AUX_CTRL_FULL_LINE_ZERO) {
 		set_auxcr(get_auxcr() | BIT(3) | BIT(2) | BIT(1));
 		cpu_notifier(l2c310_cpu_enable_flz, 0);
@@ -760,11 +768,11 @@  static void __init l2c310_fixup(void __iomem *base, u32 cache_id,
 
 	if (revision >= L310_CACHE_ID_RTL_R3P0 &&
 	    revision < L310_CACHE_ID_RTL_R3P2) {
-		u32 val = readl_relaxed(base + L310_PREFETCH_CTRL);
+		u32 val = l2x0_saved_regs.prefetch_ctrl;
 		/* I don't think bit23 is required here... but iMX6 does so */
 		if (val & (BIT(30) | BIT(23))) {
 			val &= ~(BIT(30) | BIT(23));
-			l2c_write_sec(val, base, L310_PREFETCH_CTRL);
+			l2x0_saved_regs.prefetch_ctrl = val;
 			errata[n++] = "752271";
 		}
 	}
@@ -800,6 +808,15 @@  static void l2c310_disable(void)
 	l2c_disable();
 }
 
+static void l2c310_resume(void)
+{
+	l2c_resume();
+
+	/* Re-enable full-line-of-zeros for Cortex-A9 */
+	if (l2x0_saved_regs.aux_ctrl & L310_AUX_CTRL_FULL_LINE_ZERO)
+		set_auxcr(get_auxcr() | BIT(3) | BIT(2) | BIT(1));
+}
+
 static const struct l2c_init_data l2c310_init_fns __initconst = {
 	.type = "L2C-310",
 	.way_size_0 = SZ_8K,
@@ -807,6 +824,7 @@  static const struct l2c_init_data l2c310_init_fns __initconst = {
 	.enable = l2c310_enable,
 	.fixup = l2c310_fixup,
 	.save = l2c310_save,
+	.configure = l2c310_configure,
 	.outer_cache = {
 		.inv_range = l2c210_inv_range,
 		.clean_range = l2c210_clean_range,
@@ -818,7 +836,7 @@  static const struct l2c_init_data l2c310_init_fns __initconst = {
 	},
 };
 
-static void __init __l2c_init(const struct l2c_init_data *data,
+static int __init __l2c_init(const struct l2c_init_data *data,
 	u32 aux_val, u32 aux_mask, u32 cache_id)
 {
 	struct outer_cache_fns fns;
@@ -826,6 +844,14 @@  static void __init __l2c_init(const struct l2c_init_data *data,
 	u32 aux, old_aux;
 
 	/*
+	 * Save the pointer globally so that callbacks which do not receive
+	 * context from callers can access the structure.
+	 */
+	l2x0_data = kmemdup(data, sizeof(*data), GFP_KERNEL);
+	if (!l2x0_data)
+		return -ENOMEM;
+
+	/*
 	 * Sanity check the aux values.  aux_mask is the bits we preserve
 	 * from reading the hardware register, and aux_val is the bits we
 	 * set.
@@ -910,6 +936,8 @@  static void __init __l2c_init(const struct l2c_init_data *data,
 		data->type, ways, l2x0_size >> 10);
 	pr_info("%s: CACHE_ID 0x%08x, AUX_CTRL 0x%08x\n",
 		data->type, cache_id, aux);
+
+	return 0;
 }
 
 void __init l2x0_init(void __iomem *base, u32 aux_val, u32 aux_mask)
@@ -936,6 +964,10 @@  void __init l2x0_init(void __iomem *base, u32 aux_val, u32 aux_mask)
 		break;
 	}
 
+	/* Read back current (default) hardware configuration */
+	if (data->save)
+		data->save(l2x0_base);
+
 	__l2c_init(data, aux_val, aux_mask, cache_id);
 }
 
@@ -1102,7 +1134,7 @@  static const struct l2c_init_data of_l2c210_data __initconst = {
 		.flush_all   = l2c210_flush_all,
 		.disable     = l2c_disable,
 		.sync        = l2c210_sync,
-		.resume      = l2c210_resume,
+		.resume      = l2c_resume,
 	},
 };
 
@@ -1120,7 +1152,7 @@  static const struct l2c_init_data of_l2c220_data __initconst = {
 		.flush_all   = l2c220_flush_all,
 		.disable     = l2c_disable,
 		.sync        = l2c220_sync,
-		.resume      = l2c210_resume,
+		.resume      = l2c_resume,
 	},
 };
 
@@ -1135,28 +1167,26 @@  static void __init l2c310_of_parse(const struct device_node *np,
 
 	of_property_read_u32_array(np, "arm,tag-latency", tag, ARRAY_SIZE(tag));
 	if (tag[0] && tag[1] && tag[2])
-		writel_relaxed(
+		l2x0_saved_regs.tag_latency =
 			L310_LATENCY_CTRL_RD(tag[0] - 1) |
 			L310_LATENCY_CTRL_WR(tag[1] - 1) |
-			L310_LATENCY_CTRL_SETUP(tag[2] - 1),
-			l2x0_base + L310_TAG_LATENCY_CTRL);
+			L310_LATENCY_CTRL_SETUP(tag[2] - 1);
 
 	of_property_read_u32_array(np, "arm,data-latency",
 				   data, ARRAY_SIZE(data));
 	if (data[0] && data[1] && data[2])
-		writel_relaxed(
+		l2x0_saved_regs.data_latency =
 			L310_LATENCY_CTRL_RD(data[0] - 1) |
 			L310_LATENCY_CTRL_WR(data[1] - 1) |
-			L310_LATENCY_CTRL_SETUP(data[2] - 1),
-			l2x0_base + L310_DATA_LATENCY_CTRL);
+			L310_LATENCY_CTRL_SETUP(data[2] - 1);
 
 	of_property_read_u32_array(np, "arm,filter-ranges",
 				   filter, ARRAY_SIZE(filter));
 	if (filter[1]) {
-		writel_relaxed(ALIGN(filter[0] + filter[1], SZ_1M),
-			       l2x0_base + L310_ADDR_FILTER_END);
-		writel_relaxed((filter[0] & ~(SZ_1M - 1)) | L310_ADDR_FILTER_EN,
-			       l2x0_base + L310_ADDR_FILTER_START);
+		l2x0_saved_regs.filter_end =
+					ALIGN(filter[0] + filter[1], SZ_1M);
+		l2x0_saved_regs.filter_start = (filter[0] & ~(SZ_1M - 1))
+					| L310_ADDR_FILTER_EN;
 	}
 
 	ret = l2x0_cache_size_of_parse(np, aux_val, aux_mask, &assoc, SZ_512K);
@@ -1188,6 +1218,7 @@  static const struct l2c_init_data of_l2c310_data __initconst = {
 	.enable = l2c310_enable,
 	.fixup = l2c310_fixup,
 	.save  = l2c310_save,
+	.configure = l2c310_configure,
 	.outer_cache = {
 		.inv_range   = l2c210_inv_range,
 		.clean_range = l2c210_clean_range,
@@ -1216,6 +1247,7 @@  static const struct l2c_init_data of_l2c310_coherent_data __initconst = {
 	.enable = l2c310_enable,
 	.fixup = l2c310_fixup,
 	.save  = l2c310_save,
+	.configure = l2c310_configure,
 	.outer_cache = {
 		.inv_range   = l2c210_inv_range,
 		.clean_range = l2c210_clean_range,
@@ -1330,16 +1362,6 @@  static void aurora_save(void __iomem *base)
 	l2x0_saved_regs.aux_ctrl = readl_relaxed(base + L2X0_AUX_CTRL);
 }
 
-static void aurora_resume(void)
-{
-	void __iomem *base = l2x0_base;
-
-	if (!(readl(base + L2X0_CTRL) & L2X0_CTRL_EN)) {
-		writel_relaxed(l2x0_saved_regs.aux_ctrl, base + L2X0_AUX_CTRL);
-		writel_relaxed(l2x0_saved_regs.ctrl, base + L2X0_CTRL);
-	}
-}
-
 /*
  * For Aurora cache in no outer mode, enable via the CP15 coprocessor
  * broadcasting of cache commands to L2.
@@ -1401,7 +1423,7 @@  static const struct l2c_init_data of_aurora_with_outer_data __initconst = {
 		.flush_all   = l2x0_flush_all,
 		.disable     = l2x0_disable,
 		.sync        = l2x0_cache_sync,
-		.resume      = aurora_resume,
+		.resume      = l2c_resume,
 	},
 };
 
@@ -1414,7 +1436,7 @@  static const struct l2c_init_data of_aurora_no_outer_data __initconst = {
 	.fixup = aurora_fixup,
 	.save  = aurora_save,
 	.outer_cache = {
-		.resume      = aurora_resume,
+		.resume      = l2c_resume,
 	},
 };
 
@@ -1562,6 +1584,7 @@  static const struct l2c_init_data of_bcm_l2x0_data __initconst = {
 	.of_parse = l2c310_of_parse,
 	.enable = l2c310_enable,
 	.save  = l2c310_save,
+	.configure = l2c310_configure,
 	.outer_cache = {
 		.inv_range   = bcm_inv_range,
 		.clean_range = bcm_clean_range,
@@ -1583,18 +1606,12 @@  static void __init tauros3_save(void __iomem *base)
 		readl_relaxed(base + L310_PREFETCH_CTRL);
 }
 
-static void tauros3_resume(void)
+static void tauros3_configure(void __iomem *base)
 {
-	void __iomem *base = l2x0_base;
-
-	if (!(readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN)) {
-		writel_relaxed(l2x0_saved_regs.aux2_ctrl,
-			       base + TAUROS3_AUX2_CTRL);
-		writel_relaxed(l2x0_saved_regs.prefetch_ctrl,
-			       base + L310_PREFETCH_CTRL);
-
-		l2c_enable(base, l2x0_saved_regs.aux_ctrl, 8);
-	}
+	writel_relaxed(l2x0_saved_regs.aux2_ctrl,
+		       base + TAUROS3_AUX2_CTRL);
+	writel_relaxed(l2x0_saved_regs.prefetch_ctrl,
+		       base + L310_PREFETCH_CTRL);
 }
 
 static const struct l2c_init_data of_tauros3_data __initconst = {
@@ -1603,9 +1620,10 @@  static const struct l2c_init_data of_tauros3_data __initconst = {
 	.num_lock = 8,
 	.enable = l2c_enable,
 	.save  = tauros3_save,
+	.configure = tauros3_configure,
 	/* Tauros3 broadcasts L1 cache operations to L2 */
 	.outer_cache = {
-		.resume      = tauros3_resume,
+		.resume      = l2c_resume,
 	},
 };
 
@@ -1661,6 +1679,10 @@  int __init l2x0_of_init(u32 aux_val, u32 aux_mask)
 	if (!of_property_read_bool(np, "cache-unified"))
 		pr_err("L2C: device tree omits to specify unified cache\n");
 
+	/* Read back current (default) hardware configuration */
+	if (data->save)
+		data->save(l2x0_base);
+
 	/* L2 configuration can only be changed if the cache is disabled */
 	if (!(readl_relaxed(l2x0_base + L2X0_CTRL) & L2X0_CTRL_EN))
 		if (data->of_parse)
@@ -1671,8 +1693,6 @@  int __init l2x0_of_init(u32 aux_val, u32 aux_mask)
 	else
 		cache_id = readl_relaxed(l2x0_base + L2X0_CACHE_ID);
 
-	__l2c_init(data, aux_val, aux_mask, cache_id);
-
-	return 0;
+	return __l2c_init(data, aux_val, aux_mask, cache_id);
 }
 #endif