Message ID | 20230412110900.69738-6-prabhakar.mahadev-lad.rj@bp.renesas.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Palmer Dabbelt |
Series | Add non-coherent DMA support for AX45MP |
Context | Check | Description |
---|---|---|
conchuod/cover_letter | success | Series has a cover letter |
conchuod/tree_selection | success | Guessed tree name to be for-next at HEAD 9c2598d43510 |
conchuod/fixes_present | success | Fixes tag not required for -next series |
conchuod/maintainers_pattern | success | MAINTAINERS pattern errors before the patch: 1 and now 1 |
conchuod/verify_signedoff | success | Signed-off-by tag matches author and committer |
conchuod/kdoc | success | Errors and warnings before: 0 this patch: 0 |
conchuod/build_rv64_clang_allmodconfig | success | Errors and warnings before: 18 this patch: 18 |
conchuod/module_param | success | Was 0 now: 0 |
conchuod/build_rv64_gcc_allmodconfig | success | Errors and warnings before: 18 this patch: 18 |
conchuod/build_rv32_defconfig | success | Build OK |
conchuod/dtb_warn_rv64 | success | Errors and warnings before: 3 this patch: 3 |
conchuod/header_inline | success | No static functions without inline keyword in header files |
conchuod/checkpatch | warning | WARNING: please write a help paragraph that fully describes the config symbol |
conchuod/source_inline | fail | Was 0 now: 3 |
conchuod/build_rv64_nommu_k210_defconfig | success | Build OK |
conchuod/verify_fixes | success | No Fixes tag |
conchuod/build_rv64_nommu_virt_defconfig | success | Build OK |
On Wed, Apr 12, 2023 at 12:08:58PM +0100, Prabhakar wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
> I/O Coherence Port (IOCP) provides an AXI interface for connecting
> external non-caching masters, such as DMA controllers. The accesses
> from IOCP are coherent with D-Caches and L2 Cache.
>
> IOCP is a specification option and is disabled on the Renesas RZ/Five
> SoC due to this reason IP blocks using DMA will fail.
>
> The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> block that allows dynamic adjustment of memory attributes in the runtime.
> It contains a configurable amount of PMA entries implemented as CSR
> registers to control the attributes of memory locations in interest.
> Below are the memory attributes supported:
> * Device, Non-bufferable
> * Device, bufferable
> * Memory, Non-cacheable, Non-bufferable
> * Memory, Non-cacheable, Bufferable
> * Memory, Write-back, No-allocate
> * Memory, Write-back, Read-allocate
> * Memory, Write-back, Write-allocate
> * Memory, Write-back, Read and Write-allocate
>
> More info about PMA (section 10.3):
> Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf
>
> As a workaround for SoCs with IOCP disabled CMO needs to be handled by
> software. Firstly OpenSBI configures the memory region as
> "Memory, Non-cacheable, Bufferable" and passes this region as a global
> shared dma pool as a DT node. With DMA_GLOBAL_POOL enabled all DMA
> allocations happen from this region and synchronization callbacks are
> implemented to synchronize when doing DMA transactions.
>
> Example PMA region passes as a DT node from OpenSBI:
> reserved-memory {
>     #address-cells = <2>;
>     #size-cells = <2>;
>     ranges;
>
>     pma_resv0@58000000 {
>         compatible = "shared-dma-pool";
>         reg = <0x0 0x58000000 0x0 0x08000000>;
>         no-map;
>         linux,dma-default;
>     };
> };
>
> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> ---
> v7 -> v8
> * Dropped function pointer usage
> * Now exporting the functions for clean/inval/flush
> * Switched to using early_initcall instead of arch_initcall
> * Dropped entry for "include/cache" from MAINTAINERS
> * Dropped dependency of RISCV on AX45MP_L2_CACHE
> * Returning error in case of cache line mismatch
> * Renamed clean/inval/flush functions

I kinda screwed you with that request given Hellwig's NAK on the
function pointer based stuff. Ah well, I prefer matching the proposed
naming of the dma core to what RVI chose for the instructions.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

I suppose this will need a resubmission once Arnd's stuff gets applied,
but I would like to see it have a run through the build bots etc.

Cheers,
Conor.
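For readers unfamiliar with the reserved-memory node quoted in the commit message: with #address-cells = <2> and #size-cells = <2>, the four cells of "reg" form a 64-bit base followed by a 64-bit size. The following is a minimal standalone sketch of that decoding, not code from the series; the names dt_cells_to_u64, dma_pool_range, dma_pool_from_reg and dma_pool_contains are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: join two 32-bit devicetree cells, high cell first. */
static inline uint64_t dt_cells_to_u64(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical description of the pool decoded from a 4-cell "reg". */
struct dma_pool_range {
	uint64_t base;
	uint64_t size;
};

/* Decode reg = <base_hi base_lo size_hi size_lo>. */
static inline struct dma_pool_range dma_pool_from_reg(const uint32_t reg[4])
{
	struct dma_pool_range r = {
		.base = dt_cells_to_u64(reg[0], reg[1]),
		.size = dt_cells_to_u64(reg[2], reg[3]),
	};

	assert(r.size != 0);
	return r;
}

/* Non-zero when [addr, addr + len) lies entirely inside the pool. */
static inline int dma_pool_contains(const struct dma_pool_range *r,
				    uint64_t addr, uint64_t len)
{
	return addr >= r->base && len <= r->size &&
	       addr - r->base <= r->size - len;
}
```

With the values from the example node, the pool starts at 0x58000000 and is 0x08000000 bytes (128 MiB) long, so it ends just below 0x60000000.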
On Wed, Apr 12, 2023 at 09:25:34PM +0100, Conor Dooley wrote:
> On Wed, Apr 12, 2023 at 12:08:58PM +0100, Prabhakar wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > I/O Coherence Port (IOCP) provides an AXI interface for connecting
> > external non-caching masters, such as DMA controllers. The accesses
> > from IOCP are coherent with D-Caches and L2 Cache.
> >
> > IOCP is a specification option and is disabled on the Renesas RZ/Five
> > SoC due to this reason IP blocks using DMA will fail.
> >
> > The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> > block that allows dynamic adjustment of memory attributes in the runtime.
> > It contains a configurable amount of PMA entries implemented as CSR
> > registers to control the attributes of memory locations in interest.
> > Below are the memory attributes supported:
> > * Device, Non-bufferable
> > * Device, bufferable
> > * Memory, Non-cacheable, Non-bufferable
> > * Memory, Non-cacheable, Bufferable
> > * Memory, Write-back, No-allocate
> > * Memory, Write-back, Read-allocate
> > * Memory, Write-back, Write-allocate
> > * Memory, Write-back, Read and Write-allocate
> >
> > More info about PMA (section 10.3):
> > Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf
> >
> > As a workaround for SoCs with IOCP disabled CMO needs to be handled by
> > software. Firstly OpenSBI configures the memory region as
> > "Memory, Non-cacheable, Bufferable" and passes this region as a global
> > shared dma pool as a DT node. With DMA_GLOBAL_POOL enabled all DMA
> > allocations happen from this region and synchronization callbacks are
> > implemented to synchronize when doing DMA transactions.
> >
> > Example PMA region passes as a DT node from OpenSBI:
> > reserved-memory {
> >     #address-cells = <2>;
> >     #size-cells = <2>;
> >     ranges;
> >
> >     pma_resv0@58000000 {
> >         compatible = "shared-dma-pool";
> >         reg = <0x0 0x58000000 0x0 0x08000000>;
> >         no-map;
> >         linux,dma-default;
> >     };
> > };
> >
> > Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> > ---
> > v7 -> v8
> > * Dropped function pointer usage
> > * Now exporting the functions for clean/inval/flush
> > * Switched to using early_initcall instead of arch_initcall
> > * Dropped entry for "include/cache" from MAINTAINERS
> > * Dropped dependency of RISCV on AX45MP_L2_CACHE
> > * Returning error in case of cache line mismatch
>
> > * Renamed clean/inval/flush functions
>
> I kinda screwed you with that request given Hellwig's NAK on the
> function pointer based stuff. Ah well, I prefer matching the proposed
> naming of the dma core to what RVI chose for the instructions.
>
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
>
> I suppose this will need a resubmission once Arnd's stuff gets applied,
> but I would like to see it have a run through the build bots etc.

So apparently my build bot did actually run against this series?
https://patchwork.kernel.org/project/linux-riscv/list/?series=739109

To be quite honest, I am not sure at all how it managed to apply the
series w/ Arnd's pre-reqs. Perhaps it has achieved some form of
sentience. There's a build failure for 32-bit that appeared on the final
patch, but is not really its fault:
../arch/riscv/mm/dma-noncoherent.c: Assembler messages:
../arch/riscv/mm/dma-noncoherent.c:104: Error: unrecognized opcode `sd s0,0(sp)'
../arch/riscv/mm/dma-noncoherent.c:105: Error: unrecognized opcode `sd ra,8(sp)'
../arch/riscv/mm/dma-noncoherent.c:110: Error: unrecognized opcode `ld ra,8(sp)'
../arch/riscv/mm/dma-noncoherent.c:111: Error: unrecognized opcode `ld s0,0(sp)'
../arch/riscv/mm/dma-noncoherent.c:111: Error: unrecognized opcode `sd s0,0(sp)'
../arch/riscv/mm/dma-noncoherent.c:112: Error: unrecognized opcode `sd ra,8(sp)'
../arch/riscv/mm/dma-noncoherent.c:117: Error: unrecognized opcode `ld ra,8(sp)'
../arch/riscv/mm/dma-noncoherent.c:118: Error: unrecognized opcode `ld s0,0(sp)'
../arch/riscv/mm/pmem.c: Assembler messages:
../arch/riscv/mm/pmem.c:98: Error: unrecognized opcode `sd s0,0(sp)'
../arch/riscv/mm/pmem.c:99: Error: unrecognized opcode `sd ra,8(sp)'
../arch/riscv/mm/pmem.c:104: Error: unrecognized opcode `ld ra,8(sp)'
../arch/riscv/mm/pmem.c:105: Error: unrecognized opcode `ld s0,0(sp)'
../arch/riscv/mm/dma-noncoherent.c:138: Error: unrecognized opcode `sd s0,0(sp)'
../arch/riscv/mm/dma-noncoherent.c:139: Error: unrecognized opcode `sd ra,8(sp)'
../arch/riscv/mm/dma-noncoherent.c:144: Error: unrecognized opcode `ld ra,8(sp)'
../arch/riscv/mm/dma-noncoherent.c:145: Error: unrecognized opcode `ld s0,0(sp)'
../arch/riscv/mm/pmem.c:104: Error: unrecognized opcode `sd s0,0(sp)'
../arch/riscv/mm/pmem.c:105: Error: unrecognized opcode `sd ra,8(sp)'
../arch/riscv/mm/pmem.c:110: Error: unrecognized opcode `ld ra,8(sp)'
../arch/riscv/mm/pmem.c:111: Error: unrecognized opcode `ld s0,0(sp)'
../arch/riscv/mm/pmem.c:110: Error: attempt to move .org backwards
../arch/riscv/mm/pmem.c:116: Error: attempt to move .org backwards
../arch/riscv/mm/dma-noncoherent.c:116: Error: attempt to move .org backwards
../arch/riscv/mm/dma-noncoherent.c:123: Error: attempt to move .org backwards
../arch/riscv/mm/dma-noncoherent.c:150: Error: attempt to move .org backwards
make[4]: *** [../scripts/Makefile.build:252: arch/riscv/mm/pmem.o] Error 1
make[4]: *** [../scripts/Makefile.build:252: arch/riscv/mm/dma-noncoherent.o] Error 1
make[4]: Target 'arch/riscv/mm/' not remade because of errors.
make[3]: *** [../scripts/Makefile.build:494: arch/riscv/mm] Error 2
make[3]: Target 'arch/riscv/' not remade because of errors.
make[2]: *** [../scripts/Makefile.build:494: arch/riscv] Error 2

The simplest solution may just be to make the erratum depend on 64BIT?

Cheers,
Conor.
Hi Conor,

On Thu, Apr 13, 2023 at 8:06 AM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> On Wed, Apr 12, 2023 at 09:25:34PM +0100, Conor Dooley wrote:
> > On Wed, Apr 12, 2023 at 12:08:58PM +0100, Prabhakar wrote:
> > > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> > >
> > > I/O Coherence Port (IOCP) provides an AXI interface for connecting
> > > external non-caching masters, such as DMA controllers. The accesses
> > > from IOCP are coherent with D-Caches and L2 Cache.
> > >
> > > IOCP is a specification option and is disabled on the Renesas RZ/Five
> > > SoC due to this reason IP blocks using DMA will fail.
> > >
> > > The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> > > block that allows dynamic adjustment of memory attributes in the runtime.
> > > It contains a configurable amount of PMA entries implemented as CSR
> > > registers to control the attributes of memory locations in interest.
> > > Below are the memory attributes supported:
> > > * Device, Non-bufferable
> > > * Device, bufferable
> > > * Memory, Non-cacheable, Non-bufferable
> > > * Memory, Non-cacheable, Bufferable
> > > * Memory, Write-back, No-allocate
> > > * Memory, Write-back, Read-allocate
> > > * Memory, Write-back, Write-allocate
> > > * Memory, Write-back, Read and Write-allocate
> > >
> > > More info about PMA (section 10.3):
> > > Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf
> > >
> > > As a workaround for SoCs with IOCP disabled CMO needs to be handled by
> > > software. Firstly OpenSBI configures the memory region as
> > > "Memory, Non-cacheable, Bufferable" and passes this region as a global
> > > shared dma pool as a DT node. With DMA_GLOBAL_POOL enabled all DMA
> > > allocations happen from this region and synchronization callbacks are
> > > implemented to synchronize when doing DMA transactions.
> > >
> > > Example PMA region passes as a DT node from OpenSBI:
> > > reserved-memory {
> > >     #address-cells = <2>;
> > >     #size-cells = <2>;
> > >     ranges;
> > >
> > >     pma_resv0@58000000 {
> > >         compatible = "shared-dma-pool";
> > >         reg = <0x0 0x58000000 0x0 0x08000000>;
> > >         no-map;
> > >         linux,dma-default;
> > >     };
> > > };
> > >
> > > Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> > > ---
> > > v7 -> v8
> > > * Dropped function pointer usage
> > > * Now exporting the functions for clean/inval/flush
> > > * Switched to using early_initcall instead of arch_initcall
> > > * Dropped entry for "include/cache" from MAINTAINERS
> > > * Dropped dependency of RISCV on AX45MP_L2_CACHE
> > > * Returning error in case of cache line mismatch
> >
> > > * Renamed clean/inval/flush functions
> >
> > I kinda screwed you with that request given Hellwig's NAK on the
> > function pointer based stuff. Ah well, I prefer matching the proposed
> > naming of the dma core to what RVI chose for the instructions.
> >
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> >
> > I suppose this will need a resubmission once Arnd's stuff gets applied,
> > but I would like to see it have a run through the build bots etc.
>
> So apparently my build bot did actually run against this series?
> https://patchwork.kernel.org/project/linux-riscv/list/?series=739109
>
> To be quite honest, I am not sure at all how it managed to apply the
> series w/ Arnd's pre-reqs. Perhaps it has achieved some form of
> sentience. There's a build failure for 32-bit that appeared on the final
> patch, but is not really its fault:
> ../arch/riscv/mm/dma-noncoherent.c: Assembler messages:
> ../arch/riscv/mm/dma-noncoherent.c:104: Error: unrecognized opcode `sd s0,0(sp)'
> ../arch/riscv/mm/dma-noncoherent.c:105: Error: unrecognized opcode `sd ra,8(sp)'
> ../arch/riscv/mm/dma-noncoherent.c:110: Error: unrecognized opcode `ld ra,8(sp)'
> ../arch/riscv/mm/dma-noncoherent.c:111: Error: unrecognized opcode `ld s0,0(sp)'
> ../arch/riscv/mm/dma-noncoherent.c:111: Error: unrecognized opcode `sd s0,0(sp)'
> ../arch/riscv/mm/dma-noncoherent.c:112: Error: unrecognized opcode `sd ra,8(sp)'
> ../arch/riscv/mm/dma-noncoherent.c:117: Error: unrecognized opcode `ld ra,8(sp)'
> ../arch/riscv/mm/dma-noncoherent.c:118: Error: unrecognized opcode `ld s0,0(sp)'
> ../arch/riscv/mm/pmem.c: Assembler messages:
> ../arch/riscv/mm/pmem.c:98: Error: unrecognized opcode `sd s0,0(sp)'
> ../arch/riscv/mm/pmem.c:99: Error: unrecognized opcode `sd ra,8(sp)'
> ../arch/riscv/mm/pmem.c:104: Error: unrecognized opcode `ld ra,8(sp)'
> ../arch/riscv/mm/pmem.c:105: Error: unrecognized opcode `ld s0,0(sp)'
> ../arch/riscv/mm/dma-noncoherent.c:138: Error: unrecognized opcode `sd s0,0(sp)'
> ../arch/riscv/mm/dma-noncoherent.c:139: Error: unrecognized opcode `sd ra,8(sp)'
> ../arch/riscv/mm/dma-noncoherent.c:144: Error: unrecognized opcode `ld ra,8(sp)'
> ../arch/riscv/mm/dma-noncoherent.c:145: Error: unrecognized opcode `ld s0,0(sp)'
> ../arch/riscv/mm/pmem.c:104: Error: unrecognized opcode `sd s0,0(sp)'
> ../arch/riscv/mm/pmem.c:105: Error: unrecognized opcode `sd ra,8(sp)'
> ../arch/riscv/mm/pmem.c:110: Error: unrecognized opcode `ld ra,8(sp)'
> ../arch/riscv/mm/pmem.c:111: Error: unrecognized opcode `ld s0,0(sp)'
> ../arch/riscv/mm/pmem.c:110: Error: attempt to move .org backwards
> ../arch/riscv/mm/pmem.c:116: Error: attempt to move .org backwards
> ../arch/riscv/mm/dma-noncoherent.c:116: Error: attempt to move .org backwards
> ../arch/riscv/mm/dma-noncoherent.c:123: Error: attempt to move .org backwards
> ../arch/riscv/mm/dma-noncoherent.c:150: Error: attempt to move .org backwards
> make[4]: *** [../scripts/Makefile.build:252: arch/riscv/mm/pmem.o] Error 1
> make[4]: *** [../scripts/Makefile.build:252: arch/riscv/mm/dma-noncoherent.o] Error 1
> make[4]: Target 'arch/riscv/mm/' not remade because of errors.
> make[3]: *** [../scripts/Makefile.build:494: arch/riscv/mm] Error 2
> make[3]: Target 'arch/riscv/' not remade because of errors.
> make[2]: *** [../scripts/Makefile.build:494: arch/riscv] Error 2
>
> The simplest solution may just be to make the erratum depend on 64BIT?
>
I don't think this will work, as pmem.c is compiled unconditionally. Is
dma-noncoherent.c also valid for RISCV-32? If not then we can make
pmem.c compile conditionally if DMA non-coherent is enabled and we
make DMA non-coherent depend on 64bit.

Cheers,
Prabhakar
On Thu, Apr 13, 2023 at 07:26:02PM +0100, Lad, Prabhakar wrote:
> > The simplest solution may just be to make the erratum depend on 64BIT?
> >
> I don't think this will work, as pmem.c is compiled unconditionally.

That'll teach me to write things like this first thing in the morning.
I somehow got it in my head that the alternative would be removed by the
preprocessor if it was not enabled. After testing it, that's not what
happened. My excuse is being tired from the gym and insufficiently
caffeinated, sorry!

> Is
> dma-noncoherent.c also valid for RISCV-32? If not then we can make
> pmem.c compile conditionally if DMA non-coherent is enabled and we
> make DMA non-coherent depend on 64bit.

Could you drop the {s,l}d in exchange for {s,l}w instead, or am I
progressing even further into braino territory?
On Thu, Apr 13, 2023 at 7:46 PM Conor Dooley <conor@kernel.org> wrote:
>
> > Is
> > dma-noncoherent.c also valid for RISCV-32? If not then we can make
> > pmem.c compile conditionally if DMA non-coherent is enabled and we
> > make DMA non-coherent depend on 64bit.
>
> Could you drop the {s,l}d in exchange for {s,l}w instead, or am I
> progressing even further into braino territory?

Just a direct exchange won't work; in addition, shifting + ORing would be
required to take care of 64-bit. (Correct me if I'm wrong here.)

I was wondering now if we need to store/restore the s0 and ra
registers. I stumbled on an x86 implementation which has a call [0] in
the ALTERNATIVE_X() macro, but there we don't store/restore the
registers. Is the RISC-V implementation of the ALT macro different
compared to x86?

[0] https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/uaccess_64.h#L105

Cheers,
Prabhakar
On Thu, Apr 13, 2023 at 10:06 PM Lad, Prabhakar <prabhakar.csengg@gmail.com> wrote:
>
> On Thu, Apr 13, 2023 at 7:46 PM Conor Dooley <conor@kernel.org> wrote:
> >
> > > Is
> > > dma-noncoherent.c also valid for RISCV-32? If not then we can make
> > > pmem.c compile conditionally if DMA non-coherent is enabled and we
> > > make DMA non-coherent depend on 64bit.
> >
> > Could you drop the {s,l}d in exchange for {s,l}w instead, or am I
> > progressing even further into braino territory?
> Just a direct exchange won't work; in addition, shifting + ORing would be
> required to take care of 64-bit. (Correct me if I'm wrong here.)
>
> I was wondering now if we need to store/restore the s0 and ra
> registers. I stumbled on an x86 implementation which has a call [0] in
> the ALTERNATIVE_X() macro, but there we don't store/restore the
> registers. Is the RISC-V implementation of the ALT macro different
> compared to x86?
>
I did try a call without the store/restore of the s0 and ra registers, and
that didn't work. So I have rewritten the assembly code so that it keeps
32-bit RISC-V compilers happy. Once done with the testing I'll send a new
version of this series. Hopefully the last ;)

Cheers,
Prabhakar
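As background to the sd/ld versus sw/lw exchange above: the 32-bit assembler rejects `sd` and `ld` because they are RV64 instructions, and saving a register on RV32 uses `sw`/`lw` with 4-byte stack slots instead of 8-byte ones. The sketch below only illustrates that width-dependent choice in plain C; it is not the code that went into the next revision, and sketch_reg_ops, sketch_reg_ops_for, sketch_format_save and sketch_format_restore are hypothetical names. To my knowledge the kernel expresses the same selection at preprocessing time with the REG_S, REG_L and SZREG macros in arch/riscv/include/asm/asm.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper type: mnemonics and slot size for one register width. */
struct sketch_reg_ops {
	const char *store;	/* "sw" on RV32, "sd" on RV64 */
	const char *load;	/* "lw" on RV32, "ld" on RV64 */
	int szreg;		/* bytes occupied by one saved register */
};

/* Pick the store/load mnemonics for a given XLEN (32 or 64). */
static inline struct sketch_reg_ops sketch_reg_ops_for(int xlen)
{
	const struct sketch_reg_ops rv32 = { "sw", "lw", 4 };
	const struct sketch_reg_ops rv64 = { "sd", "ld", 8 };

	assert(xlen == 32 || xlen == 64);
	return xlen == 32 ? rv32 : rv64;
}

/* Format the "save s0 and ra" sequence; returns the snprintf() result. */
static inline int sketch_format_save(char *buf, size_t len, int xlen)
{
	const struct sketch_reg_ops ops = sketch_reg_ops_for(xlen);

	return snprintf(buf, len, "%s s0, %d(sp)\n%s ra, %d(sp)\n",
			ops.store, 0 * ops.szreg, ops.store, 1 * ops.szreg);
}

/* Format the matching "restore ra and s0" sequence. */
static inline int sketch_format_restore(char *buf, size_t len, int xlen)
{
	const struct sketch_reg_ops ops = sketch_reg_ops_for(xlen);

	return snprintf(buf, len, "%s ra, %d(sp)\n%s s0, %d(sp)\n",
			ops.load, 1 * ops.szreg, ops.load, 0 * ops.szreg);
}
```

For xlen = 32 the save sequence comes out as `sw s0, 0(sp)` followed by `sw ra, 4(sp)`; for xlen = 64 it matches the `sd s0,0(sp)` / `sd ra,8(sp)` pair seen in the build log.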
diff --git a/MAINTAINERS b/MAINTAINERS
index 3afd45f71043..9afd39a23524 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -19898,6 +19898,13 @@ S:	Supported
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
 F:	drivers/staging/
 
+STANDALONE CACHE CONTROLLER DRIVERS
+M:	Conor Dooley <conor@kernel.org>
+L:	linux-riscv@lists.infradead.org
+S:	Maintained
+T:	git https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git/
+F:	drivers/cache
+
 STARFIRE/DURALAN NETWORK DRIVER
 M:	Ion Badulescu <ionut@badula.org>
 S:	Odd Fixes
diff --git a/drivers/Kconfig b/drivers/Kconfig
index 968bd0a6fd78..44abd2cba3a3 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -15,6 +15,8 @@ source "drivers/base/Kconfig"
 
 source "drivers/bus/Kconfig"
 
+source "drivers/cache/Kconfig"
+
 source "drivers/connector/Kconfig"
 
 source "drivers/firmware/Kconfig"
diff --git a/drivers/Makefile b/drivers/Makefile
index 20b118dca999..db5a8115093f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -11,6 +11,7 @@ ifdef building_out_of_srctree
 MAKEFLAGS += --include-dir=$(srctree)
 endif
 
+obj-y				+= cache/
 obj-y				+= irqchip/
 obj-y				+= bus/
 
diff --git a/drivers/cache/Kconfig b/drivers/cache/Kconfig
new file mode 100644
index 000000000000..b97269cbd149
--- /dev/null
+++ b/drivers/cache/Kconfig
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: GPL-2.0
+menu "Cache Drivers"
+
+config AX45MP_L2_CACHE
+	bool "Andes Technology AX45MP L2 Cache controller"
+	depends on RISCV_DMA_NONCOHERENT
+	help
+	  Support for the L2 cache controller on Andes Technology AX45MP platforms.
+
+endmenu
diff --git a/drivers/cache/Makefile b/drivers/cache/Makefile
new file mode 100644
index 000000000000..2012e7fb978d
--- /dev/null
+++ b/drivers/cache/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_AX45MP_L2_CACHE)		+= ax45mp_cache.o
diff --git a/drivers/cache/ax45mp_cache.c b/drivers/cache/ax45mp_cache.c
new file mode 100644
index 000000000000..cfc40b967c55
--- /dev/null
+++ b/drivers/cache/ax45mp_cache.c
@@ -0,0 +1,222 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * non-coherent cache functions for Andes AX45MP
+ *
+ * Copyright (C) 2023 Renesas Electronics Corp.
+ */
+
+#include <linux/cacheflush.h>
+#include <linux/cacheinfo.h>
+#include <linux/dma-direction.h>
+#include <linux/of_address.h>
+#include <linux/of_platform.h>
+
+/* L2 cache registers */
+#define AX45MP_L2C_REG_CTL_OFFSET		0x8
+
+#define AX45MP_L2C_REG_C0_CMD_OFFSET		0x40
+#define AX45MP_L2C_REG_C0_ACC_OFFSET		0x48
+#define AX45MP_L2C_REG_STATUS_OFFSET		0x80
+
+/* D-cache operation */
+#define AX45MP_CCTL_L1D_VA_INVAL		0 /* Invalidate an L1 cache entry */
+#define AX45MP_CCTL_L1D_VA_WB			1 /* Write-back an L1 cache entry */
+
+/* L2 CCTL status */
+#define AX45MP_CCTL_L2_STATUS_IDLE		0
+
+/* L2 CCTL status cores mask */
+#define AX45MP_CCTL_L2_STATUS_C0_MASK		0xf
+
+/* L2 cache operation */
+#define AX45MP_CCTL_L2_PA_INVAL			0x8 /* Invalidate an L2 cache entry */
+#define AX45MP_CCTL_L2_PA_WB			0x9 /* Write-back an L2 cache entry */
+
+#define AX45MP_L2C_REG_PER_CORE_OFFSET		0x10
+#define AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET	4
+
+#define AX45MP_L2C_REG_CN_CMD_OFFSET(n)	\
+	(AX45MP_L2C_REG_C0_CMD_OFFSET + ((n) * AX45MP_L2C_REG_PER_CORE_OFFSET))
+#define AX45MP_L2C_REG_CN_ACC_OFFSET(n)	\
+	(AX45MP_L2C_REG_C0_ACC_OFFSET + ((n) * AX45MP_L2C_REG_PER_CORE_OFFSET))
+#define AX45MP_CCTL_L2_STATUS_CN_MASK(n)	\
+	(AX45MP_CCTL_L2_STATUS_C0_MASK << ((n) * AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET))
+
+#define AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM	0x80b
+#define AX45MP_CCTL_REG_UCCTLCOMMAND_NUM	0x80c
+
+#define AX45MP_CACHE_LINE_SIZE			64
+
+struct ax45mp_priv {
+	void __iomem *l2c_base;
+	u32 ax45mp_cache_line_size;
+};
+
+static struct ax45mp_priv ax45mp_priv;
+
+/* L2 Cache operations */
+static inline uint32_t ax45mp_cpu_l2c_get_cctl_status(void)
+{
+	return readl(ax45mp_priv.l2c_base + AX45MP_L2C_REG_STATUS_OFFSET);
+}
+
+static void ax45mp_cpu_cache_operation(unsigned long start, unsigned long end,
+				       unsigned long line_size, unsigned int l1_op,
+				       unsigned int l2_op)
+{
+	void __iomem *base = ax45mp_priv.l2c_base;
+	int mhartid = smp_processor_id();
+	unsigned long pa;
+
+	while (end > start) {
+		csr_write(AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM, start);
+		csr_write(AX45MP_CCTL_REG_UCCTLCOMMAND_NUM, l1_op);
+
+		pa = virt_to_phys((void *)start);
+		writel(pa, base + AX45MP_L2C_REG_CN_ACC_OFFSET(mhartid));
+		writel(l2_op, base + AX45MP_L2C_REG_CN_CMD_OFFSET(mhartid));
+		while ((ax45mp_cpu_l2c_get_cctl_status() &
+			AX45MP_CCTL_L2_STATUS_CN_MASK(mhartid)) !=
+			AX45MP_CCTL_L2_STATUS_IDLE)
+			;
+
+		start += line_size;
+	}
+}
+
+/* Write-back L1 and L2 cache entry */
+static inline void ax45mp_cpu_dcache_wb_range(unsigned long start, unsigned long end,
+					      unsigned long line_size)
+{
+	ax45mp_cpu_cache_operation(start, end, line_size,
+				   AX45MP_CCTL_L1D_VA_WB,
+				   AX45MP_CCTL_L2_PA_WB);
+}
+
+/* Invalidate the L1 and L2 cache entry */
+static inline void ax45mp_cpu_dcache_inval_range(unsigned long start, unsigned long end,
+						 unsigned long line_size)
+{
+	ax45mp_cpu_cache_operation(start, end, line_size,
+				   AX45MP_CCTL_L1D_VA_INVAL,
+				   AX45MP_CCTL_L2_PA_INVAL);
+}
+
+void ax45mp_dma_cache_inv(void *vaddr, unsigned long size)
+{
+	unsigned long start = (unsigned long)vaddr;
+	char cache_buf[2][AX45MP_CACHE_LINE_SIZE];
+	unsigned long end = start + size;
+	unsigned long old_start = start;
+	unsigned long old_end = end;
+	unsigned long line_size;
+	unsigned long flags;
+
+	if (unlikely(start == end))
+		return;
+
+	line_size = ax45mp_priv.ax45mp_cache_line_size;
+
+	memset(&cache_buf, 0x0, sizeof(cache_buf));
+	start = start & (~(line_size - 1));
+	end = ((end + line_size - 1) & (~(line_size - 1)));
+
+	local_irq_save(flags);
+	if (unlikely(start != old_start))
+		memcpy(&cache_buf[0][0], (void *)start, line_size);
+
+	if (unlikely(end != old_end))
+		memcpy(&cache_buf[1][0], (void *)(old_end & (~(line_size - 1))), line_size);
+
+	ax45mp_cpu_dcache_inval_range(start, end, line_size);
+
+	if (unlikely(start != old_start))
+		memcpy((void *)start, &cache_buf[0][0], (old_start & (line_size - 1)));
+
+	local_irq_restore(flags);
+}
+EXPORT_SYMBOL_GPL(ax45mp_dma_cache_inv);
+
+void ax45mp_dma_cache_wback(void *vaddr, unsigned long size)
+{
+	unsigned long start = (unsigned long)vaddr;
+	unsigned long end = start + size;
+	unsigned long line_size;
+	unsigned long flags;
+
+	line_size = ax45mp_priv.ax45mp_cache_line_size;
+	start = start & (~(line_size - 1));
+	local_irq_save(flags);
+	ax45mp_cpu_dcache_wb_range(start, end, line_size);
+	local_irq_restore(flags);
+}
+EXPORT_SYMBOL_GPL(ax45mp_dma_cache_wback);
+
+void ax45mp_dma_cache_wback_inv(void *vaddr, unsigned long size)
+{
+	ax45mp_dma_cache_wback(vaddr, size);
+	ax45mp_dma_cache_inv(vaddr, size);
+}
+EXPORT_SYMBOL_GPL(ax45mp_dma_cache_wback_inv);
+
+static int ax45mp_get_l2_line_size(struct device_node *np)
+{
+	int ret;
+
+	ret = of_property_read_u32(np, "cache-line-size", &ax45mp_priv.ax45mp_cache_line_size);
+	if (ret) {
+		pr_err("Failed to get cache-line-size, defaulting to 64 bytes\n");
+		return ret;
+	}
+
+	if (ax45mp_priv.ax45mp_cache_line_size != AX45MP_CACHE_LINE_SIZE) {
+		pr_err("Expected cache-line-size to be 64 bytes (found:%u)\n",
+		       ax45mp_priv.ax45mp_cache_line_size);
+		return ret;
+	}
+
+	return 0;
+}
+
+static const struct of_device_id ax45mp_cache_ids[] = {
+	{ .compatible = "andestech,ax45mp-cache" },
+	{ /* sentinel */ }
+};
+
+static int __init ax45mp_cache_init(void)
+{
+	struct device_node *np;
+	struct resource res;
+	int ret;
+
+	np = of_find_matching_node(NULL, ax45mp_cache_ids);
+	if (!of_device_is_available(np))
+		return -ENODEV;
+
+	ret = of_address_to_resource(np, 0, &res);
+	if (ret)
+		return ret;
+
+	/*
+	 * If IOCP is present on the Andes AX45MP core riscv_cbom_block_size
+	 * will be 0 for sure, so we can definitely rely on it. If
+	 * riscv_cbom_block_size = 0 we don't need to handle CMO using SW any
+	 * more so we just return success here and only if its being set we
+	 * continue further in the probe path.
+	 */
+	if (!riscv_cbom_block_size)
+		return 0;
+
+	ax45mp_priv.l2c_base = ioremap(res.start, resource_size(&res));
+	if (!ax45mp_priv.l2c_base)
+		return -ENOMEM;
+
+	ret = ax45mp_get_l2_line_size(np);
+	if (ret) {
+		iounmap(ax45mp_priv.l2c_base);
+		return ret;
+	}
+
+	return 0;
+}
+early_initcall(ax45mp_cache_init);
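The address arithmetic used by the driver in the patch (rounding a range out to cache-line boundaries, and deriving the per-core L2 command offset and status mask) can be checked in isolation. The following is a minimal userspace sketch under stated assumptions: it reuses the constants from the patch (0x40, 0x10, 0xf and a 4-bit status field per core), but the sketch_* function names are hypothetical and it is not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of line (line must be a power of two). */
static inline uintptr_t sketch_line_align_down(uintptr_t addr, uintptr_t line)
{
	assert(line != 0 && (line & (line - 1)) == 0);
	return addr & ~(line - 1);
}

/* Round addr up to a multiple of line, as done for "end" in the inval path. */
static inline uintptr_t sketch_line_align_up(uintptr_t addr, uintptr_t line)
{
	return sketch_line_align_down(addr + line - 1, line);
}

/* Mirrors AX45MP_L2C_REG_CN_CMD_OFFSET(n): 0x40 plus 0x10 per core. */
static inline uint32_t sketch_cn_cmd_offset(unsigned int n)
{
	return 0x40u + n * 0x10u;
}

/* Mirrors AX45MP_CCTL_L2_STATUS_CN_MASK(n): a 4-bit field per core. */
static inline uint32_t sketch_cn_status_mask(unsigned int n)
{
	return 0xfu << (n * 4u);
}
```

With a 64-byte line, a buffer covering 0x1004..0x1044 is widened to 0x1000..0x1080, which is why ax45mp_dma_cache_inv() saves the partial first line before invalidating it.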