[v8,5/7] cache: Add L2 cache management for Andes AX45MP RISC-V core

Message ID 20230412110900.69738-6-prabhakar.mahadev-lad.rj@bp.renesas.com (mailing list archive)
State Superseded
Delegated to: Geert Uytterhoeven
Series Add non-coherent DMA support for AX45MP

Commit Message

Lad, Prabhakar April 12, 2023, 11:08 a.m. UTC
From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

I/O Coherence Port (IOCP) provides an AXI interface for connecting
external non-caching masters, such as DMA controllers. The accesses
from IOCP are coherent with D-Caches and L2 Cache.

IOCP is a specification option and is disabled on the Renesas RZ/Five
SoC; for this reason, IP blocks using DMA will fail.

The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
block that allows dynamic adjustment of memory attributes at runtime.
It contains a configurable number of PMA entries implemented as CSR
registers to control the attributes of memory locations of interest.
Below are the memory attributes supported:
* Device, Non-bufferable
* Device, Bufferable
* Memory, Non-cacheable, Non-bufferable
* Memory, Non-cacheable, Bufferable
* Memory, Write-back, No-allocate
* Memory, Write-back, Read-allocate
* Memory, Write-back, Write-allocate
* Memory, Write-back, Read and Write-allocate

More info about PMA (section 10.3):
Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf

As a workaround for SoCs with IOCP disabled, CMO needs to be handled by
software. First, OpenSBI configures the memory region as
"Memory, Non-cacheable, Bufferable" and passes this region to Linux as a
global shared DMA pool in a DT node. With DMA_GLOBAL_POOL enabled, all DMA
allocations happen from this region, and synchronization callbacks are
implemented to keep the caches in sync when doing DMA transactions.

Example PMA region passed as a DT node from OpenSBI:
    reserved-memory {
        #address-cells = <2>;
        #size-cells = <2>;
        ranges;

        pma_resv0@58000000 {
            compatible = "shared-dma-pool";
            reg = <0x0 0x58000000 0x0 0x08000000>;
            no-map;
            linux,dma-default;
        };
    };
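
As a rough illustration of how the reg property above is read (a minimal
userspace sketch, not part of the patch): with #address-cells = <2> and
#size-cells = <2>, each of the base and the size is built from two 32-bit
cells, high word first. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Combine two 32-bit devicetree cells (high word first) into a 64-bit value. */
static uint64_t dt_cells_to_u64(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Exclusive end address of a region; rejects empty or wrapping regions. */
static uint64_t region_end(uint64_t base, uint64_t size)
{
	assert(size != 0 && base + size > base);
	return base + size;
}

/* reg = <0x0 0x58000000 0x0 0x08000000>; copied from the example node. */
static const uint32_t pma_reg[4] = { 0x0, 0x58000000, 0x0, 0x08000000 };

/* First two cells: base address of the pool. */
static uint64_t pma_base(void)
{
	return dt_cells_to_u64(pma_reg[0], pma_reg[1]);
}

/* Last two cells: size of the pool in bytes. */
static uint64_t pma_size(void)
{
	return dt_cells_to_u64(pma_reg[2], pma_reg[3]);
}
```

Under these assumptions the pool starts at 0x58000000, is 128 MiB long and
ends just below 0x60000000.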

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
---
v7 -> v8
* Dropped function pointer usage
* Now exporting the functions for clean/inval/flush
* Switched to using early_initcall instead of arch_initcall
* Dropped entry for "include/cache" from MAINTAINERS
* Dropped dependency of RISCV on AX45MP_L2_CACHE
* Returning error in case of cache line mismatch
* Renamed clean/inval/flush functions

v6 -> v7
* Implemented flush callback
* Dropped using riscv_dma_noncoherent_cmo_ops

v5 -> v6
* Moved driver to cache folder
* Switched to new API for CMO

v4 -> v5
* Dropped code for configuring L2 cache
* Dropped code for configuring PMA
* Updated commit message
* Added comments
* Changed static branch enable/disable order

RFC v3 -> v4
* Made use of runtime patching instead of compile time
* Now just exposing single function ax45mp_no_iocp_cmo() for CMO handling
* Added a check to make sure cache line size is always 64 bytes
* Renamed folder rzf -> rzfive
* Improved Kconfig description
* Dropped L2 cache configuration
* Dropped unnecessary casts
* Fixed comments pointed out by Geert.
---
 MAINTAINERS                  |   7 ++
 drivers/Kconfig              |   2 +
 drivers/Makefile             |   1 +
 drivers/cache/Kconfig        |  10 ++
 drivers/cache/Makefile       |   3 +
 drivers/cache/ax45mp_cache.c | 222 +++++++++++++++++++++++++++++++++++
 6 files changed, 245 insertions(+)
 create mode 100644 drivers/cache/Kconfig
 create mode 100644 drivers/cache/Makefile
 create mode 100644 drivers/cache/ax45mp_cache.c

Comments

Conor Dooley April 12, 2023, 8:25 p.m. UTC | #1
On Wed, Apr 12, 2023 at 12:08:58PM +0100, Prabhakar wrote:

> * Renamed clean/inval/flush functions

I kinda screwed you with that request given Hellwig's NAK on the
function pointer based stuff. Ah well, I prefer matching the proposed
naming of the dma core to what RVI chose for the instructions.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

I suppose this will need a resubmission once Arnd's stuff gets applied,
but I would like to see it have a run through the build bots etc.

Cheers,
Conor.
Conor Dooley April 13, 2023, 7:06 a.m. UTC | #2
On Wed, Apr 12, 2023 at 09:25:34PM +0100, Conor Dooley wrote:
> On Wed, Apr 12, 2023 at 12:08:58PM +0100, Prabhakar wrote:
> > * Renamed clean/inval/flush functions
> 
> I kinda screwed you with that request given Hellwig's NAK on the
> function pointer based stuff. Ah well, I prefer matching the proposed
> naming of the dma core to what RVI chose for the instructions.
> 
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> 
> I suppose this will need a resubmission once Arnd's stuff gets applied,
> but I would like to see it have a run through the build bots etc.

So apparently my build bot did actually run against this series?
https://patchwork.kernel.org/project/linux-riscv/list/?series=739109

To be quite honest, I am not sure at all how it managed to apply the
series w/ Arnd's pre-reqs. Perhaps it has achieved some form of
sentience. There's a build failure for 32-bit that appeared on the final
patch, but is not really its fault:
../arch/riscv/mm/dma-noncoherent.c: Assembler messages:
../arch/riscv/mm/dma-noncoherent.c:104: Error: unrecognized opcode `sd s0,0(sp)'
../arch/riscv/mm/dma-noncoherent.c:105: Error: unrecognized opcode `sd ra,8(sp)'
../arch/riscv/mm/dma-noncoherent.c:110: Error: unrecognized opcode `ld ra,8(sp)'
../arch/riscv/mm/dma-noncoherent.c:111: Error: unrecognized opcode `ld s0,0(sp)'
../arch/riscv/mm/dma-noncoherent.c:111: Error: unrecognized opcode `sd s0,0(sp)'
../arch/riscv/mm/dma-noncoherent.c:112: Error: unrecognized opcode `sd ra,8(sp)'
../arch/riscv/mm/dma-noncoherent.c:117: Error: unrecognized opcode `ld ra,8(sp)'
../arch/riscv/mm/dma-noncoherent.c:118: Error: unrecognized opcode `ld s0,0(sp)'
../arch/riscv/mm/pmem.c: Assembler messages:
../arch/riscv/mm/pmem.c:98: Error: unrecognized opcode `sd s0,0(sp)'
../arch/riscv/mm/pmem.c:99: Error: unrecognized opcode `sd ra,8(sp)'
../arch/riscv/mm/pmem.c:104: Error: unrecognized opcode `ld ra,8(sp)'
../arch/riscv/mm/pmem.c:105: Error: unrecognized opcode `ld s0,0(sp)'
../arch/riscv/mm/dma-noncoherent.c:138: Error: unrecognized opcode `sd s0,0(sp)'
../arch/riscv/mm/dma-noncoherent.c:139: Error: unrecognized opcode `sd ra,8(sp)'
../arch/riscv/mm/dma-noncoherent.c:144: Error: unrecognized opcode `ld ra,8(sp)'
../arch/riscv/mm/dma-noncoherent.c:145: Error: unrecognized opcode `ld s0,0(sp)'
../arch/riscv/mm/pmem.c:104: Error: unrecognized opcode `sd s0,0(sp)'
../arch/riscv/mm/pmem.c:105: Error: unrecognized opcode `sd ra,8(sp)'
../arch/riscv/mm/pmem.c:110: Error: unrecognized opcode `ld ra,8(sp)'
../arch/riscv/mm/pmem.c:111: Error: unrecognized opcode `ld s0,0(sp)'
../arch/riscv/mm/pmem.c:110: Error: attempt to move .org backwards
../arch/riscv/mm/pmem.c:116: Error: attempt to move .org backwards
../arch/riscv/mm/dma-noncoherent.c:116: Error: attempt to move .org backwards
../arch/riscv/mm/dma-noncoherent.c:123: Error: attempt to move .org backwards
../arch/riscv/mm/dma-noncoherent.c:150: Error: attempt to move .org backwards
make[4]: *** [../scripts/Makefile.build:252: arch/riscv/mm/pmem.o] Error 1
make[4]: *** [../scripts/Makefile.build:252: arch/riscv/mm/dma-noncoherent.o] Error 1
make[4]: Target 'arch/riscv/mm/' not remade because of errors.
make[3]: *** [../scripts/Makefile.build:494: arch/riscv/mm] Error 2
make[3]: Target 'arch/riscv/' not remade because of errors.
make[2]: *** [../scripts/Makefile.build:494: arch/riscv] Error 2

The simplest solution may just be to make the erratum depend on 64BIT?

Cheers,
Conor.
Lad, Prabhakar April 13, 2023, 6:26 p.m. UTC | #3
Hi Conor,

On Thu, Apr 13, 2023 at 8:06 AM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> On Wed, Apr 12, 2023 at 09:25:34PM +0100, Conor Dooley wrote:
> > On Wed, Apr 12, 2023 at 12:08:58PM +0100, Prabhakar wrote:
> > > * Renamed clean/inval/flush functions
> >
> > I kinda screwed you with that request given Hellwig's NAK on the
> > function pointer based stuff. Ah well, I prefer matching the proposed
> > naming of the dma core to what RVI chose for the instructions.
> >
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> >
> > I suppose this will need a resubmission once Arnd's stuff gets applied,
> > but I would like to see it have a run through the build bots etc.
>
> So apparently my build bot did actually run against this series?
> https://patchwork.kernel.org/project/linux-riscv/list/?series=739109
>
> To be quite honest, I am not sure at all how it managed to apply the
> series w/ Arnd's pre-reqs. Perhaps it has achieved some form of
> sentience. There's a build failure for 32-bit that appeared on the final
> patch, but is not really its fault:
> [build log snipped]
>
> The simplest solution may just be to make the erratum depend on 64BIT?
>
I don't think this will work, as pmem.c is compiled unconditionally. Is
dma-noncoherent.c also valid for RISCV-32? If not, then we can make
pmem.c compile conditionally if DMA non-coherent is enabled and
make DMA non-coherent depend on 64BIT.

Cheers,
Prabhakar
Conor Dooley April 13, 2023, 6:46 p.m. UTC | #4
On Thu, Apr 13, 2023 at 07:26:02PM +0100, Lad, Prabhakar wrote:

> > The simplest solution may just be to make the erratum depend on 64BIT?
> >
> I don't think this will work, as pmem.c is compiled unconditionally.

That'll teach me to write things like this first thing in the morning.
I somehow got it in my head that the alternative would be removed by the
preprocessor if it was not enabled. After testing it, that's not what
happened.
My excuse is being tired from the gym and insufficiently caffeinated,
sorry!

> Is
> dma-noncoherent.c also valid for RISCV-32? If not then we can make
> pmem.c compile conditionally if DMA non-coherent is enabled and we
> make DMA non-coherent depend on 64bit.

Could you drop the {s,l}d in exchange for {s,l}w instead, or am I
progressing even further into braino territory?
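
For context on the {s,l}d versus {s,l}w question: sd and ld are RV64-only
instructions, which is why the 32-bit assembler rejects them. On RV32 the
general-purpose registers are 32 bits wide, so a word store saves a full
register. The kernel's arch/riscv/include/asm/asm.h provides REG_S, REG_L and
SZREG to pick the width-appropriate mnemonic and slot size; whether the
reworked assembly ends up using them is not stated in this thread. The sketch
below only models that selection in plain C, and its function names are
hypothetical.

```c
#include <assert.h>

/* Bytes per general-purpose register for a given XLEN (32 or 64). */
static unsigned int reg_bytes(unsigned int xlen)
{
	assert(xlen == 32 || xlen == 64);
	return xlen / 8;
}

/* Mnemonic that stores one full register: "sd" on RV64, "sw" on RV32. */
static const char *reg_store(unsigned int xlen)
{
	return reg_bytes(xlen) == 8 ? "sd" : "sw";
}

/* Mnemonic that loads one full register: "ld" on RV64, "lw" on RV32. */
static const char *reg_load(unsigned int xlen)
{
	return reg_bytes(xlen) == 8 ? "ld" : "lw";
}

/* Stack offset of the n-th saved register slot (slot 0 is at 0(sp)). */
static unsigned int slot_offset(unsigned int xlen, unsigned int n)
{
	return n * reg_bytes(xlen);
}
```

With these helpers the RV64 sequence "sd s0,0(sp); sd ra,8(sp)" from the
build log corresponds to "sw s0,0(sp); sw ra,4(sp)" on RV32.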
Lad, Prabhakar April 13, 2023, 9:06 p.m. UTC | #5
On Thu, Apr 13, 2023 at 7:46 PM Conor Dooley <conor@kernel.org> wrote:
>
> > Is
> > dma-noncoherent.c also valid for RISCV-32? If not then we can make
> > pmem.c compile conditionally if DMA non-coherent is enabled and we
> > make DMA non-coherent depend on 64bit.
>
> Could you drop the {s,l}d in exchange for {s,l}w instead, or am I
> progressing even further into braino territory?
A direct exchange alone won't work; in addition, shifting + ORing would be
required to take care of 64-bit values. (Correct me if I'm wrong here.)

I was wondering now if we need to store/restore the s0 and ra
registers at all. I stumbled on an x86 implementation which has a call [0]
in the ALTERNATIVE_X() macro, but there the registers are not
stored/restored. Is the RISC-V implementation of the ALT macro different
from the x86 one?

[0] https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/uaccess_64.h#L105

Cheers,
Prabhakar
Lad, Prabhakar April 14, 2023, 6:59 p.m. UTC | #6
On Thu, Apr 13, 2023 at 10:06 PM Lad, Prabhakar
<prabhakar.csengg@gmail.com> wrote:
>
> On Thu, Apr 13, 2023 at 7:46 PM Conor Dooley <conor@kernel.org> wrote:
> >
> > > Is
> > > dma-noncoherent.c also valid for RISCV-32? If not then we can make
> > > pmem.c compile conditionally if DMA non-coherent is enabled and we
> > > make DMA non-coherent depend on 64bit.
> >
> > Could you drop the {s,l}d in exchange for {s,l}w instead, or am I
> > progressing even further into braino territory?
> A direct exchange alone won't work; in addition, shifting + ORing would be
> required to take care of 64-bit values. (Correct me if I'm wrong here.)
>
> I was wondering now if we need to store/restore the s0 and ra
> registers at all. I stumbled on an x86 implementation which has a call [0]
> in the ALTERNATIVE_X() macro, but there the registers are not
> stored/restored. Is the RISC-V implementation of the ALT macro different
> from the x86 one?
>
I did try a call without store/restore of the s0 and ra registers, and that
didn't work! So I have rewritten the assembly code in a way that keeps
32-bit RISC-V compilers happy. Once done with the testing, I'll send a
new version of this series. Hopefully the last ;)

Cheers,
Prabhakar
Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 3afd45f71043..9afd39a23524 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -19898,6 +19898,13 @@  S:	Supported
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
 F:	drivers/staging/
 
+STANDALONE CACHE CONTROLLER DRIVERS
+M:	Conor Dooley <conor@kernel.org>
+L:	linux-riscv@lists.infradead.org
+S:	Maintained
+T:	git https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git/
+F:	drivers/cache
+
 STARFIRE/DURALAN NETWORK DRIVER
 M:	Ion Badulescu <ionut@badula.org>
 S:	Odd Fixes
diff --git a/drivers/Kconfig b/drivers/Kconfig
index 968bd0a6fd78..44abd2cba3a3 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -15,6 +15,8 @@  source "drivers/base/Kconfig"
 
 source "drivers/bus/Kconfig"
 
+source "drivers/cache/Kconfig"
+
 source "drivers/connector/Kconfig"
 
 source "drivers/firmware/Kconfig"
diff --git a/drivers/Makefile b/drivers/Makefile
index 20b118dca999..db5a8115093f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -11,6 +11,7 @@  ifdef building_out_of_srctree
 MAKEFLAGS += --include-dir=$(srctree)
 endif
 
+obj-y				+= cache/
 obj-y				+= irqchip/
 obj-y				+= bus/
 
diff --git a/drivers/cache/Kconfig b/drivers/cache/Kconfig
new file mode 100644
index 000000000000..b97269cbd149
--- /dev/null
+++ b/drivers/cache/Kconfig
@@ -0,0 +1,10 @@ 
+# SPDX-License-Identifier: GPL-2.0
+menu "Cache Drivers"
+
+config AX45MP_L2_CACHE
+	bool "Andes Technology AX45MP L2 Cache controller"
+	depends on RISCV_DMA_NONCOHERENT
+	help
+	  Support for the L2 cache controller on Andes Technology AX45MP platforms.
+
+endmenu
diff --git a/drivers/cache/Makefile b/drivers/cache/Makefile
new file mode 100644
index 000000000000..2012e7fb978d
--- /dev/null
+++ b/drivers/cache/Makefile
@@ -0,0 +1,3 @@ 
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_AX45MP_L2_CACHE) += ax45mp_cache.o
diff --git a/drivers/cache/ax45mp_cache.c b/drivers/cache/ax45mp_cache.c
new file mode 100644
index 000000000000..cfc40b967c55
--- /dev/null
+++ b/drivers/cache/ax45mp_cache.c
@@ -0,0 +1,222 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * non-coherent cache functions for Andes AX45MP
+ *
+ * Copyright (C) 2023 Renesas Electronics Corp.
+ */
+
+#include <linux/cacheflush.h>
+#include <linux/cacheinfo.h>
+#include <linux/dma-direction.h>
+#include <linux/of_address.h>
+#include <linux/of_platform.h>
+
+/* L2 cache registers */
+#define AX45MP_L2C_REG_CTL_OFFSET		0x8
+
+#define AX45MP_L2C_REG_C0_CMD_OFFSET		0x40
+#define AX45MP_L2C_REG_C0_ACC_OFFSET		0x48
+#define AX45MP_L2C_REG_STATUS_OFFSET		0x80
+
+/* D-cache operation */
+#define AX45MP_CCTL_L1D_VA_INVAL		0 /* Invalidate an L1 cache entry */
+#define AX45MP_CCTL_L1D_VA_WB			1 /* Write-back an L1 cache entry */
+
+/* L2 CCTL status */
+#define AX45MP_CCTL_L2_STATUS_IDLE		0
+
+/* L2 CCTL status cores mask */
+#define AX45MP_CCTL_L2_STATUS_C0_MASK		0xf
+
+/* L2 cache operation */
+#define AX45MP_CCTL_L2_PA_INVAL			0x8 /* Invalidate an L2 cache entry */
+#define AX45MP_CCTL_L2_PA_WB			0x9 /* Write-back an L2 cache entry */
+
+#define AX45MP_L2C_REG_PER_CORE_OFFSET		0x10
+#define AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET	4
+
+#define AX45MP_L2C_REG_CN_CMD_OFFSET(n)	\
+	(AX45MP_L2C_REG_C0_CMD_OFFSET + ((n) * AX45MP_L2C_REG_PER_CORE_OFFSET))
+#define AX45MP_L2C_REG_CN_ACC_OFFSET(n)	\
+	(AX45MP_L2C_REG_C0_ACC_OFFSET + ((n) * AX45MP_L2C_REG_PER_CORE_OFFSET))
+#define AX45MP_CCTL_L2_STATUS_CN_MASK(n)	\
+	(AX45MP_CCTL_L2_STATUS_C0_MASK << ((n) * AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET))
+
+#define AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM	0x80b
+#define AX45MP_CCTL_REG_UCCTLCOMMAND_NUM	0x80c
+
+#define AX45MP_CACHE_LINE_SIZE			64
+
+struct ax45mp_priv {
+	void __iomem *l2c_base;
+	u32 ax45mp_cache_line_size;
+};
+
+static struct ax45mp_priv ax45mp_priv;
+
+/* L2 Cache operations */
+static inline uint32_t ax45mp_cpu_l2c_get_cctl_status(void)
+{
+	return readl(ax45mp_priv.l2c_base + AX45MP_L2C_REG_STATUS_OFFSET);
+}
+
+static void ax45mp_cpu_cache_operation(unsigned long start, unsigned long end,
+				       unsigned long line_size, unsigned int l1_op,
+				       unsigned int l2_op)
+{
+	void __iomem *base = ax45mp_priv.l2c_base;
+	int mhartid = smp_processor_id();
+	unsigned long pa;
+
+	while (end > start) {
+		csr_write(AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM, start);
+		csr_write(AX45MP_CCTL_REG_UCCTLCOMMAND_NUM, l1_op);
+
+		pa = virt_to_phys((void *)start);
+		writel(pa, base + AX45MP_L2C_REG_CN_ACC_OFFSET(mhartid));
+		writel(l2_op, base + AX45MP_L2C_REG_CN_CMD_OFFSET(mhartid));
+		while ((ax45mp_cpu_l2c_get_cctl_status() &
+			AX45MP_CCTL_L2_STATUS_CN_MASK(mhartid)) !=
+			AX45MP_CCTL_L2_STATUS_IDLE)
+			;
+
+		start += line_size;
+	}
+}
+
+/* Write-back L1 and L2 cache entry */
+static inline void ax45mp_cpu_dcache_wb_range(unsigned long start, unsigned long end,
+					      unsigned long line_size)
+{
+	ax45mp_cpu_cache_operation(start, end, line_size,
+				   AX45MP_CCTL_L1D_VA_WB,
+				   AX45MP_CCTL_L2_PA_WB);
+}
+
+/* Invalidate the L1 and L2 cache entry */
+static inline void ax45mp_cpu_dcache_inval_range(unsigned long start, unsigned long end,
+						 unsigned long line_size)
+{
+	ax45mp_cpu_cache_operation(start, end, line_size,
+				   AX45MP_CCTL_L1D_VA_INVAL,
+				   AX45MP_CCTL_L2_PA_INVAL);
+}
+
+void ax45mp_dma_cache_inv(void *vaddr, unsigned long size)
+{
+	unsigned long start = (unsigned long)vaddr;
+	char cache_buf[2][AX45MP_CACHE_LINE_SIZE];
+	unsigned long end = start + size;
+	unsigned long old_start = start;
+	unsigned long old_end = end;
+	unsigned long line_size;
+	unsigned long flags;
+
+	if (unlikely(start == end))
+		return;
+
+	line_size = ax45mp_priv.ax45mp_cache_line_size;
+
+	memset(&cache_buf, 0x0, sizeof(cache_buf));
+	start = start & (~(line_size - 1));
+	end = ((end + line_size - 1) & (~(line_size - 1)));
+
+	local_irq_save(flags);
+	if (unlikely(start != old_start))
+		memcpy(&cache_buf[0][0], (void *)start, line_size);
+
+	if (unlikely(end != old_end))
+		memcpy(&cache_buf[1][0], (void *)(old_end & (~(line_size - 1))), line_size);
+
+	ax45mp_cpu_dcache_inval_range(start, end, line_size);
+
+	if (unlikely(start != old_start))
+		memcpy((void *)start, &cache_buf[0][0], (old_start & (line_size - 1)));
+
+	local_irq_restore(flags);
+}
+EXPORT_SYMBOL_GPL(ax45mp_dma_cache_inv);
+
+void ax45mp_dma_cache_wback(void *vaddr, unsigned long size)
+{
+	unsigned long start = (unsigned long)vaddr;
+	unsigned long end = start + size;
+	unsigned long line_size;
+	unsigned long flags;
+
+	line_size = ax45mp_priv.ax45mp_cache_line_size;
+	start = start & (~(line_size - 1));
+	local_irq_save(flags);
+	ax45mp_cpu_dcache_wb_range(start, end, line_size);
+	local_irq_restore(flags);
+}
+EXPORT_SYMBOL_GPL(ax45mp_dma_cache_wback);
+
+void ax45mp_dma_cache_wback_inv(void *vaddr, unsigned long size)
+{
+	ax45mp_dma_cache_wback(vaddr, size);
+	ax45mp_dma_cache_inv(vaddr, size);
+}
+EXPORT_SYMBOL_GPL(ax45mp_dma_cache_wback_inv);
+
+static int ax45mp_get_l2_line_size(struct device_node *np)
+{
+	int ret;
+
+	ret = of_property_read_u32(np, "cache-line-size", &ax45mp_priv.ax45mp_cache_line_size);
+	if (ret) {
+		pr_err("Failed to get cache-line-size\n");
+		return ret;
+	}
+
+	if (ax45mp_priv.ax45mp_cache_line_size != AX45MP_CACHE_LINE_SIZE) {
+		pr_err("Expected cache-line-size to be 64 bytes (found:%u)\n",
+		       ax45mp_priv.ax45mp_cache_line_size);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static const struct of_device_id ax45mp_cache_ids[] = {
+	{ .compatible = "andestech,ax45mp-cache" },
+	{ /* sentinel */ }
+};
+
+static int __init ax45mp_cache_init(void)
+{
+	struct device_node *np;
+	struct resource res;
+	int ret;
+
+	np = of_find_matching_node(NULL, ax45mp_cache_ids);
+	if (!of_device_is_available(np))
+		return -ENODEV;
+
+	ret = of_address_to_resource(np, 0, &res);
+	if (ret)
+		return ret;
+
+	/*
+	 * If IOCP is present on the Andes AX45MP core, riscv_cbom_block_size
+	 * will be 0, so it can be relied on here. When
+	 * riscv_cbom_block_size is 0, CMO does not need to be handled in
+	 * software, so return success; continue further in the probe path
+	 * only if it is set.
+	 */
+	if (!riscv_cbom_block_size)
+		return 0;
+
+	ax45mp_priv.l2c_base = ioremap(res.start, resource_size(&res));
+	if (!ax45mp_priv.l2c_base)
+		return -ENOMEM;
+
+	ret = ax45mp_get_l2_line_size(np);
+	if (ret) {
+		iounmap(ax45mp_priv.l2c_base);
+		return ret;
+	}
+
+	return 0;
+}
+early_initcall(ax45mp_cache_init);
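
The address arithmetic in the driver above can be checked in isolation. The
following is a minimal userspace sketch, not part of the patch: the offset
and mask constants are copied from the driver's #defines, while the function
names are hypothetical. It shows how a buffer is widened to whole 64-byte
cache lines and how the per-core L2 command/access register offsets and
status mask are derived.

```c
#include <assert.h>

/* Constants copied from drivers/cache/ax45mp_cache.c in the patch. */
#define L2C_REG_C0_CMD_OFFSET		0x40
#define L2C_REG_C0_ACC_OFFSET		0x48
#define L2C_REG_PER_CORE_OFFSET		0x10
#define L2_STATUS_C0_MASK		0xf
#define L2_STATUS_PER_CORE_OFFSET	4

/* Round addr down to the start of its cache line (line is a power of two). */
static unsigned long line_align_down(unsigned long addr, unsigned long line)
{
	assert(line != 0 && (line & (line - 1)) == 0);
	return addr & ~(line - 1);
}

/* Round addr up to the next cache-line boundary. */
static unsigned long line_align_up(unsigned long addr, unsigned long line)
{
	return line_align_down(addr + line - 1, line);
}

/* Number of cache lines covered by the buffer [start, start + size). */
static unsigned long lines_touched(unsigned long start, unsigned long size,
				   unsigned long line)
{
	return (line_align_up(start + size, line) -
		line_align_down(start, line)) / line;
}

/* Per-core command register offset: C0 offset plus a 0x10 stride per core. */
static unsigned int l2c_cmd_offset(unsigned int core)
{
	return L2C_REG_C0_CMD_OFFSET + core * L2C_REG_PER_CORE_OFFSET;
}

/* Per-core access (address) register offset, same stride. */
static unsigned int l2c_acc_offset(unsigned int core)
{
	return L2C_REG_C0_ACC_OFFSET + core * L2C_REG_PER_CORE_OFFSET;
}

/* Status bits for one core: a 4-bit field per core in the status register. */
static unsigned int l2c_status_mask(unsigned int core)
{
	return L2_STATUS_C0_MASK << (core * L2_STATUS_PER_CORE_OFFSET);
}
```

For example, a 0x50-byte buffer at 0x1010 spans 0x1000..0x1080, i.e. two
cache lines, and core 1 issues its commands through the register at offset
0x50.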