Message ID | 20250128-runtime_const_riscv-v3-1-11922989e2d3@rivosinc.com (mailing list archive) |
---|---|
State | Superseded |
Series | [v3] riscv: Add runtime constant support |

Context | Check | Description |
---|---|---|
bjorn/pre-ci_am | success | Success |
bjorn/build-rv32-defconfig | fail | build-rv32-defconfig |
bjorn/build-rv64-clang-allmodconfig | success | build-rv64-clang-allmodconfig |
bjorn/build-rv64-gcc-allmodconfig | success | build-rv64-gcc-allmodconfig |
bjorn/build-rv64-nommu-k210-defconfig | success | build-rv64-nommu-k210-defconfig |
bjorn/build-rv64-nommu-k210-virt | success | build-rv64-nommu-k210-virt |
bjorn/checkpatch | warning | checkpatch |
bjorn/dtb-warn-rv64 | success | dtb-warn-rv64 |
bjorn/header-inline | success | header-inline |
bjorn/kdoc | success | kdoc |
bjorn/module-param | success | module-param |
bjorn/verify-fixes | success | verify-fixes |
bjorn/verify-signedoff | success | verify-signedoff |
Charlie Jenkins wrote:
> Implement the runtime constant infrastructure for riscv. Use this
> infrastructure to generate constants to be used by the d_hash()
> function.
>
> This is the riscv variant of commit 94a2bc0f611c ("arm64: add 'runtime
> constant' support") and commit e3c92e81711d ("runtime constants: add
> x86 architecture support").
>
> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> ---
> Ard brought this to my attention in this patch [1].
>
> [1] https://lore.kernel.org/lkml/CAMj1kXE4DJnwFejNWQu784GvyJO=aGNrzuLjSxiowX_e7nW8QA@mail.gmail.com/
> ---
> Changes in v3:
> - Leverage "pack" instruction for runtime_const_ptr() to reduce hot path
>   by 3 instructions if Zbkb is supported. Suggested by Pasha Bouzarjomehri (pasha@rivosinc.com)
> - Link to v2: https://lore.kernel.org/r/20250127-runtime_const_riscv-v2-1-95ae7cf97a39@rivosinc.com
>
> Changes in v2:
> - Treat instructions as __le32 and do proper conversions (Ben)
> - Link to v1: https://lore.kernel.org/r/20250127-runtime_const_riscv-v1-1-795b023ea20b@rivosinc.com
> ---
>  arch/riscv/include/asm/runtime-const.h | 194 +++++++++++++++++++++++++++++++++
>  arch/riscv/kernel/vmlinux.lds.S | 3 +
>  2 files changed, 197 insertions(+)
>
> diff --git a/arch/riscv/include/asm/runtime-const.h b/arch/riscv/include/asm/runtime-const.h
> new file mode 100644
> index 0000000000000000000000000000000000000000..0ecbe6967013900781b0b1048d4622f676b64076
> --- /dev/null
> +++ b/arch/riscv/include/asm/runtime-const.h
> @@ -0,0 +1,194 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_RISCV_RUNTIME_CONST_H
> +#define _ASM_RISCV_RUNTIME_CONST_H
> +
> +#include <asm/alternative.h>
> +#include <asm/cacheflush.h>
> +#include <asm/text-patching.h>
> +#include <linux/uaccess.h>
> +
> +#ifdef CONFIG_32BIT
> +#define runtime_const_ptr(sym) \
> +({ \
> +	typeof(sym) __ret, __tmp; \
> +	asm_inline("1:\t" \
> +		".option push" \
> +		".option norvc" \
> +		"lui %[__ret],0x89abd\n\t" \
> +		"addi %[__ret],-0x211\n\t" \
> +		".option pop" \
> +		".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
> +		".long 1b - .\n\t" \
> +		".popsection" \
> +		: [__ret] "=r" (__ret)); \
> +	__ret; \
> +})
> +#else
> +/*
> + * Loading 64-bit constants into a register from immediates is a non-trivial
> + * task on riscv64. To get it somewhat performant, load 32 bits into two
> + * different registers and then combine the results.
> + *
> + * If the processor supports the Zbkb extension, we can combine the final
> + * "slli,slli,srli,add" into the single "pack" instruction. If the processor
> + * doesn't support Zbkb but does support the Zbb extension, we can
> + * combine the final "slli,srli,add" into one instruction "add.uw".
> + */
> +#define runtime_const_ptr(sym) \
> +({ \
> +	typeof(sym) __ret, __tmp; \
> +	asm_inline("1:\t" \
> +		".option push\n\t" \
> +		".option norvc\n\t" \
> +		"lui %[__ret],0x89abd\n\t" \
> +		"lui %[__tmp],0x1234\n\t" \
> +		"addiw %[__ret],%[__ret],-0x211\n\t" \
> +		"addiw %[__tmp],%[__tmp],0x567\n\t" \
> +		ALTERNATIVE_2( \
> +			"slli %[__tmp],%[__tmp],32\n\t" \
> +			"slli %[__ret],%[__ret],32\n\t" \
> +			"srli %[__ret],%[__ret],32\n\t" \
> +			"add %[__ret],%[__ret],%[__tmp]\n\t", \
> +			".option push\n\t" \
> +			".option arch,+zba\n\t" \
> +			"slli %[__tmp],%[__tmp],32\n\t" \
> +			"add.uw %[__ret],%[__ret],%[__tmp]\n\t" \
> +			"nop\n\t" \
> +			"nop\n\t" \
> +			".option pop\n\t", \
> +			0, RISCV_ISA_EXT_ZBA, 1, \
> +			".option push\n\t" \
> +			".option arch,+zbkb\n\t" \
> +			"pack %[__ret],%[__ret],%[__tmp]\n\t" \
> +			"nop\n\t" \
> +			"nop\n\t" \
> +			"nop\n\t" \
> +			".option pop\n\t", \
> +			0, RISCV_ISA_EXT_ZBKB, 1 \
> +		) \
> +		".option pop\n\t" \
> +		".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
> +		".long 1b - .\n\t" \
> +		".popsection" \
> +		: [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> +	__ret; \
> +})
> +#endif
> +
> +#ifdef CONFIG_32BIT
> +#define SRLI "srli "
> +#else
> +#define SRLI "srliw "
> +#endif
> +
> +#define runtime_const_shift_right_32(val, sym) \
> +({ \
> +	u32 __ret; \
> +	asm_inline("1:\t" \
> +		".option push\n\t" \
> +		".option norvc\n\t" \
> +		SRLI "%[__ret],%[__val],12\n\t" \
> +		".option pop\n\t" \
> +		".pushsection runtime_shift_" #sym ",\"a\"\n\t" \
> +		".long 1b - .\n\t" \
> +		".popsection" \
> +		: [__ret] "=r" (__ret) \
> +		: [__val] "r" (val)); \
> +	__ret; \
> +})
> +
> +#define runtime_const_init(type, sym) do { \
> +	extern s32 __start_runtime_##type##_##sym[]; \
> +	extern s32 __stop_runtime_##type##_##sym[]; \
> +	\
> +	runtime_const_fixup(__runtime_fixup_##type, \
> +			    (unsigned long)(sym), \
> +			    __start_runtime_##type##_##sym, \
> +			    __stop_runtime_##type##_##sym); \
> +} while (0)
> +
> +static inline void __runtime_fixup_caches(void *where, unsigned int insns)
> +{
> +	/* On riscv there are currently only cache-wide flushes so va is ignored. */
> +	__always_unused uintptr_t va = (uintptr_t)where;
> +
> +	flush_icache_range(va, va + 4*insns);
> +}
> +
> +/*
> + * The 32-bit immediate is stored in a lui+addi pairing.
> + * lui holds the upper 20 bits of the immediate in the first 20 bits of the instruction.
> + * addi holds the lower 12 bits of the immediate in the first 12 bits of the instruction.
> + */
> +static inline void __runtime_fixup_32(u32 *lui, u32 *addi, unsigned int val)
> +{
> +	unsigned int lower_immediate, upper_immediate;
> +	u32 lui_insn = le32_to_cpu(*lui);
> +	u32 addi_insn = le32_to_cpu(*addi);

Because of the compressed extensions RISC-V instructions are only aligned on
16bit boundaries, so is there another reason you know that these two
instructions are 32bit aligned? Otherwise you're adding unaligned accesses
here.

/Emil

> +	__le32 addi_res, lui_res;
> +
> +	lower_immediate = sign_extend32(val, 11);
> +	upper_immediate = (val - lower_immediate);
> +
> +	if (upper_immediate & 0xfffff000) {
> +		/* replace upper 20 bits of lui with upper immediate */
> +		lui_insn &= 0x00000fff;
> +		lui_insn |= upper_immediate & 0xfffff000;
> +	} else {
> +		/* replace lui with nop if immediate is small enough to fit in addi */
> +		lui_insn = 0x00000013;
> +	}
> +
> +	if (lower_immediate & 0x00000fff) {
> +		/* replace upper 12 bits of addi with lower 12 bits of val */
> +		addi_insn &= 0x000fffff;
> +		addi_insn |= (lower_immediate & 0x00000fff) << 20;
> +	} else {
> +		/* replace addi with nop if lower_immediate is empty */
> +		addi_insn = 0x00000013;
> +	}
> +
> +	addi_res = cpu_to_le32(addi_insn);
> +	lui_res = cpu_to_le32(lui_insn);
> +	patch_insn_write(addi, &addi_res, sizeof(addi_res));
> +	patch_insn_write(lui, &lui_res, sizeof(lui_res));
> +}
> +
> +static inline void __runtime_fixup_ptr(void *where, unsigned long val)
> +{
> +	if (IS_ENABLED(CONFIG_32BIT)) {
> +		__runtime_fixup_32(where, where + 4, val);
> +		__runtime_fixup_caches(where, 2);
> +	} else {
> +		__runtime_fixup_32(where, where + 8, val);
> +		__runtime_fixup_32(where + 4, where + 12, val >> 32);
> +		__runtime_fixup_caches(where, 4);
> +	}
> +}
> +
> +/*
> + * Replace the least significant 5 bits of the srli/srliw immediate that is
> + * located at bits 20-24
> + */
> +static inline void __runtime_fixup_shift(void *where, unsigned long val)
> +{
> +	u32 insn = le32_to_cpu(*(__le32 *)where);
> +	__le32 res;
> +
> +	insn &= 0xfe0fffff;
> +	insn |= (val & 0b11111) << 20;
> +
> +	res = cpu_to_le32(insn);
> +	patch_text_nosync(where, &res, sizeof(insn));
> +}
> +
> +static inline void runtime_const_fixup(void (*fn)(void *, unsigned long),
> +				       unsigned long val, s32 *start, s32 *end)
> +{
> +	while (start < end) {
> +		fn(*start + (void *)start, val);
> +		start++;
> +	}
> +}
> +
> +#endif /* _ASM_RISCV_RUNTIME_CONST_H */
> diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
> index 002ca58dd998cb78b662837b5ebac988fb6c77bb..61bd5ba6680a786bf1db7dc37bf1acda0639b5c7 100644
> --- a/arch/riscv/kernel/vmlinux.lds.S
> +++ b/arch/riscv/kernel/vmlinux.lds.S
> @@ -97,6 +97,9 @@ SECTIONS
>  	{
>  		EXIT_DATA
>  	}
> +
> +	RUNTIME_CONST_VARIABLES
> +
>  	PERCPU_SECTION(L1_CACHE_BYTES)
>
>  	.rel.dyn : {
>
> ---
> base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04
> change-id: 20250123-runtime_const_riscv-6cd854ee2817
> --
> - Charlie
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
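
For readers following the lui+addi discussion, the immediate split performed by __runtime_fixup_32() can be sketched in plain userspace C. This is only an illustration under stated assumptions: the helper names (lo12, hi20, patch_lui, patch_addi) are hypothetical and not part of the patch, the nop substitution for zero immediates is left out, and no kernel text-patching API is involved. It shows why the placeholder 0x89abcdef appears in the macro as "lui 0x89abd" plus "addi -0x211": addi sign-extends its 12-bit immediate, so the upper part has to be rounded up when bit 11 is set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the sign-extended low 12 bits, i.e. what addi adds. */
static int32_t lo12(uint32_t val)
{
	int32_t lo = (int32_t)(val & 0xfff);

	return (lo & 0x800) ? lo - 0x1000 : lo;
}

/* Hypothetical helper: the value lui must load so that hi20 + lo12 == val. */
static uint32_t hi20(uint32_t val)
{
	uint32_t hi = val - (uint32_t)lo12(val);

	/* The low 12 bits always cancel out, so hi fits a U-type immediate. */
	assert((hi & 0xfff) == 0);
	return hi;
}

/* Replace the U-type immediate (bits 31:12) of a lui instruction word. */
static uint32_t patch_lui(uint32_t insn, uint32_t val)
{
	return (insn & 0x00000fff) | (hi20(val) & 0xfffff000);
}

/* Replace the I-type immediate (bits 31:20) of an addi instruction word. */
static uint32_t patch_addi(uint32_t insn, uint32_t val)
{
	return (insn & 0x000fffff) | (((uint32_t)lo12(val) & 0xfff) << 20);
}
```

With val = 0x89abcdef, lo12() gives -0x211 and hi20() gives 0x89abd000, matching the placeholder immediates used in the macro. The instruction words passed to patch_lui() and patch_addi() are assumed to already be in CPU byte order.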
On Wed, Jan 29, 2025 at 04:30:49AM -0500, Emil Renner Berthing wrote:
> Charlie Jenkins wrote:
> > +/*
> > + * The 32-bit immediate is stored in a lui+addi pairing.
> > + * lui holds the upper 20 bits of the immediate in the first 20 bits of the instruction.
> > + * addi holds the lower 12 bits of the immediate in the first 12 bits of the instruction.
> > + */
> > +static inline void __runtime_fixup_32(u32 *lui, u32 *addi, unsigned int val)
> > +{
> > +	unsigned int lower_immediate, upper_immediate;
> > +	u32 lui_insn = le32_to_cpu(*lui);
> > +	u32 addi_insn = le32_to_cpu(*addi);
>
> Because of the compressed extensions RISC-V instructions are only aligned on
> 16bit boundaries, so is there another reason you know that these two
> instructions are 32bit aligned? Otherwise you're adding unaligned accesses
> here.

Great point, thank you. I will add a ".align 4" to the beginning of these
instructions to force the alignment.

- Charlie

>
> /Emil
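
The alignment concern raised above can be illustrated with a small userspace C sketch. It is not taken from the patch and is not the fix proposed in this thread (which is to align the instruction sequence). It only shows, under the assumption that instructions are at least 2-byte aligned, how a 32-bit instruction word could be read as two little-endian 16-bit parcels without a 4-byte load. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* True if p may be dereferenced as a 32-bit word without a misaligned access. */
static int is_aligned_32(const void *p)
{
	return ((uintptr_t)p & 3u) == 0;
}

/*
 * Hypothetical helper: read one 32-bit instruction starting at a 2-byte
 * aligned address by assembling two little-endian 16-bit parcels.
 * Only byte loads are issued, so the result does not depend on 4-byte
 * alignment or on host endianness.
 */
static uint32_t read_insn_parcels(const uint8_t *where)
{
	uint16_t lo, hi;

	/* With the compressed extension, 2-byte alignment is all we may assume. */
	assert(((uintptr_t)where & 1u) == 0);

	lo = (uint16_t)(where[0] | (where[1] << 8));
	hi = (uint16_t)(where[2] | (where[3] << 8));
	return (uint32_t)lo | ((uint32_t)hi << 16);
}
```

A direct `*(u32 *)where` load, as in the quoted `le32_to_cpu(*lui)`, is only safe when is_aligned_32(where) holds; the parcel read works for any 2-byte aligned address.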
Hi Charlie,

kernel test robot noticed the following build errors:

[auto build test ERROR on ffd294d346d185b70e28b1a28abe367bbfe53c04]

url:    https://github.com/intel-lab-lkp/linux/commits/Charlie-Jenkins/riscv-Add-runtime-constant-support/20250129-040853
base:   ffd294d346d185b70e28b1a28abe367bbfe53c04
patch link:    https://lore.kernel.org/r/20250128-runtime_const_riscv-v3-1-11922989e2d3%40rivosinc.com
patch subject: [PATCH v3] riscv: Add runtime constant support
config: riscv-randconfig-001-20250131 (https://download.01.org/0day-ci/archive/20250131/202501310838.EJsvg1L1-lkp@intel.com/config)
compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project ab51eccf88f5321e7c60591c5546b254b6afab99)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250131/202501310838.EJsvg1L1-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501310838.EJsvg1L1-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

   In file included from include/linux/io.h:14:
   In file included from arch/riscv/include/asm/io.h:136:
   include/asm-generic/io.h:812:2: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     812 |         insw(addr, buffer, count);
         |         ^~~~~~~~~~~~~~~~~~~~~~~~~
   arch/riscv/include/asm/io.h:105:53: note: expanded from macro 'insw'
     105 | #define insw(addr, buffer, count) __insw(PCI_IOBASE + (addr), buffer, count)
         |                                          ~~~~~~~~~~ ^
   In file included from fs/dcache.c:29:
   In file included from include/linux/security.h:35:
   In file included from include/linux/bpf.h:31:
   In file included from include/linux/memcontrol.h:13:
   In file included from include/linux/cgroup.h:26:
   In file included from include/linux/kernel_stat.h:8:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/riscv/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:14:
   In file included from arch/riscv/include/asm/io.h:136:
   include/asm-generic/io.h:820:2: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     820 |         insl(addr, buffer, count);
         |         ^~~~~~~~~~~~~~~~~~~~~~~~~
   arch/riscv/include/asm/io.h:106:53: note: expanded from macro 'insl'
     106 | #define insl(addr, buffer, count) __insl(PCI_IOBASE + (addr), buffer, count)
         |                                          ~~~~~~~~~~ ^
   In file included from fs/dcache.c:29:
   In file included from include/linux/security.h:35:
   In file included from include/linux/bpf.h:31:
   In file included from include/linux/memcontrol.h:13:
   In file included from include/linux/cgroup.h:26:
   In file included from include/linux/kernel_stat.h:8:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/riscv/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:14:
   In file included from arch/riscv/include/asm/io.h:136:
   include/asm-generic/io.h:829:2: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     829 |         outsb(addr, buffer, count);
         |         ^~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/riscv/include/asm/io.h:118:55: note: expanded from macro 'outsb'
     118 | #define outsb(addr, buffer, count) __outsb(PCI_IOBASE + (addr), buffer, count)
         |                                            ~~~~~~~~~~ ^
   In file included from fs/dcache.c:29:
   In file included from include/linux/security.h:35:
   In file included from include/linux/bpf.h:31:
   In file included from include/linux/memcontrol.h:13:
   In file included from include/linux/cgroup.h:26:
   In file included from include/linux/kernel_stat.h:8:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/riscv/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:14:
   In file included from arch/riscv/include/asm/io.h:136:
   include/asm-generic/io.h:838:2: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     838 |         outsw(addr, buffer, count);
         |         ^~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/riscv/include/asm/io.h:119:55: note: expanded from macro 'outsw'
     119 | #define outsw(addr, buffer, count) __outsw(PCI_IOBASE + (addr), buffer, count)
         |                                            ~~~~~~~~~~ ^
   In file included from fs/dcache.c:29:
   In file included from include/linux/security.h:35:
   In file included from include/linux/bpf.h:31:
   In file included from include/linux/memcontrol.h:13:
   In file included from include/linux/cgroup.h:26:
   In file included from include/linux/kernel_stat.h:8:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/riscv/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:14:
   In file included from arch/riscv/include/asm/io.h:136:
   include/asm-generic/io.h:847:2: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     847 |         outsl(addr, buffer, count);
         |         ^~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/riscv/include/asm/io.h:120:55: note: expanded from macro 'outsl'
     120 | #define outsl(addr, buffer, count) __outsl(PCI_IOBASE + (addr), buffer, count)
         |                                            ~~~~~~~~~~ ^
   In file included from fs/dcache.c:29:
   In file included from include/linux/security.h:35:
   In file included from include/linux/bpf.h:31:
   In file included from include/linux/memcontrol.h:13:
   In file included from include/linux/cgroup.h:26:
   In file included from include/linux/kernel_stat.h:8:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/riscv/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:14:
   In file included from arch/riscv/include/asm/io.h:136:
   include/asm-generic/io.h:1175:55: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
    1175 |         return (port > MMIO_UPPER_LIMIT) ? NULL : PCI_IOBASE + port;
         |                                                   ~~~~~~~~~~ ^
>> fs/dcache.c:112:9: warning: unused variable '__tmp' [-Wunused-variable]
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/riscv/include/asm/runtime-const.h:13:21: note: expanded from macro 'runtime_const_ptr'
      13 |         typeof(sym) __ret, __tmp; \
         |                            ^~~~~
>> fs/dcache.c:112:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'arch', 'relax' or 'norelax' [-Winline-asm]
   arch/riscv/include/asm/runtime-const.h:14:13: note: expanded from macro 'runtime_const_ptr'
      14 |         asm_inline("1:\t" \
         |                    ^
   <inline asm>:1:25: note: instantiated into assembly here
       1 | 1:      .option push.option norvclui s2,0x89abd
         |                              ^
>> fs/dcache.c:112:9: error: invalid operand for instruction
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:17:26: note: expanded from macro 'runtime_const_ptr'
      17 |                 "lui %[__ret],0x89abd\n\t" \
         |                                        ^
   <inline asm>:2:10: note: instantiated into assembly here
       2 |         addi s2,-0x211
         |                 ^
>> fs/dcache.c:112:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'arch', 'relax' or 'norelax' [-Winline-asm]
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:18:26: note: expanded from macro 'runtime_const_ptr'
      18 |                 "addi %[__ret],-0x211\n\t" \
         |                                        ^
   <inline asm>:3:26: note: instantiated into assembly here
       3 |         .option pop.pushsection runtime_ptr_dentry_hashtable,"a"
         |                                 ^
>> fs/dcache.c:112:9: error: .popsection without corresponding .pushsection
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:21:18: note: expanded from macro 'runtime_const_ptr'
      21 |                 ".long 1b - .\n\t" \
         |                                ^
   <inline asm>:5:13: note: instantiated into assembly here
       5 |         .popsection
         |                    ^
>> fs/dcache.c:112:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'arch', 'relax' or 'norelax' [-Winline-asm]
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:14:13: note: expanded from macro 'runtime_const_ptr'
      14 |         asm_inline("1:\t" \
         |                    ^
   <inline asm>:1:25: note: instantiated into assembly here
       1 | 1:      .option push.option norvclui a0,0x89abd
         |                              ^
>> fs/dcache.c:112:9: error: invalid operand for instruction
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:17:26: note: expanded from macro 'runtime_const_ptr'
      17 |                 "lui %[__ret],0x89abd\n\t" \
         |                                        ^
   <inline asm>:2:10: note: instantiated into assembly here
       2 |         addi a0,-0x211
         |                 ^
>> fs/dcache.c:112:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'arch', 'relax' or 'norelax' [-Winline-asm]
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:18:26: note: expanded from macro 'runtime_const_ptr'
      18 |                 "addi %[__ret],-0x211\n\t" \
         |                                        ^
   <inline asm>:3:26: note: instantiated into assembly here
       3 |         .option pop.pushsection runtime_ptr_dentry_hashtable,"a"
         |                                 ^
>> fs/dcache.c:112:9: error: .popsection without corresponding .pushsection
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:21:18: note: expanded from macro 'runtime_const_ptr'
      21 |                 ".long 1b - .\n\t" \
         |                                ^
   <inline asm>:5:13: note: instantiated into assembly here
       5 |         .popsection
         |                    ^
>> fs/dcache.c:112:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'arch', 'relax' or 'norelax' [-Winline-asm]
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:14:13: note: expanded from macro 'runtime_const_ptr'
      14 |         asm_inline("1:\t" \
         |                    ^
   <inline asm>:1:25: note: instantiated into assembly here
       1 | 1:      .option push.option norvclui a5,0x89abd
         |                              ^
>> fs/dcache.c:112:9: error: invalid operand for instruction
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:17:26: note: expanded from macro 'runtime_const_ptr'
      17 |                 "lui %[__ret],0x89abd\n\t" \
         |                                        ^
   <inline asm>:2:10: note: instantiated into assembly here
       2 |         addi a5,-0x211
         |                 ^
>> fs/dcache.c:112:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'arch', 'relax' or 'norelax' [-Winline-asm]
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:18:26: note: expanded from macro 'runtime_const_ptr'
      18 |                 "addi %[__ret],-0x211\n\t" \
         |                                        ^
   <inline asm>:3:26: note: instantiated into assembly here
       3 |         .option pop.pushsection runtime_ptr_dentry_hashtable,"a"
         |                                 ^
>> fs/dcache.c:112:9: error: .popsection without corresponding .pushsection
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:21:18: note: expanded from macro 'runtime_const_ptr'
      21 |                 ".long 1b - .\n\t" \
         |                                ^
   <inline asm>:5:13: note: instantiated into assembly here
       5 |         .popsection
         |                    ^
>> fs/dcache.c:112:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'arch', 'relax' or 'norelax' [-Winline-asm]
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:14:13: note: expanded from macro 'runtime_const_ptr'
      14 |         asm_inline("1:\t" \
         |                    ^
   <inline asm>:1:25: note: instantiated into assembly here
       1 | 1:      .option push.option norvclui a1,0x89abd
         |                              ^
>> fs/dcache.c:112:9: error: invalid operand for instruction
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:17:26: note: expanded from macro 'runtime_const_ptr'
      17 |                 "lui %[__ret],0x89abd\n\t" \
         |                                        ^
   <inline asm>:2:10: note: instantiated into assembly here
       2 |         addi a1,-0x211
         |                 ^
>> fs/dcache.c:112:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'arch', 'relax' or 'norelax' [-Winline-asm]
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:18:26: note: expanded from macro 'runtime_const_ptr'
      18 |                 "addi %[__ret],-0x211\n\t" \
         |                                        ^
   <inline asm>:3:26: note: instantiated into assembly here
       3 |         .option pop.pushsection runtime_ptr_dentry_hashtable,"a"
         |                                 ^
>> fs/dcache.c:112:9: error: .popsection without corresponding .pushsection
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:21:18: note: expanded from macro 'runtime_const_ptr'
      21 |                 ".long 1b - .\n\t" \
         |                                ^
   <inline asm>:5:13: note: instantiated into assembly here
       5 |         .popsection
         |                    ^
>> fs/dcache.c:112:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'arch', 'relax' or 'norelax' [-Winline-asm]
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:14:13: note: expanded from macro 'runtime_const_ptr'
      14 |         asm_inline("1:\t" \
         |                    ^
   <inline asm>:1:25: note: instantiated into assembly here
       1 | 1:      .option push.option norvclui a0,0x89abd
         |                              ^
>> fs/dcache.c:112:9: error: invalid operand for instruction
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:17:26: note: expanded from macro 'runtime_const_ptr'
      17 |                 "lui %[__ret],0x89abd\n\t" \
         |                                        ^
   <inline asm>:2:10: note: instantiated into assembly here
       2 |         addi a0,-0x211
         |                 ^
>> fs/dcache.c:112:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'arch', 'relax' or 'norelax' [-Winline-asm]
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:18:26: note: expanded from macro 'runtime_const_ptr'
      18 |                 "addi %[__ret],-0x211\n\t" \
         |                                        ^
   <inline asm>:3:26: note: instantiated into assembly here
       3 |         .option pop.pushsection runtime_ptr_dentry_hashtable,"a"
         |                                 ^
   fs/dcache.c:112:9: error: .popsection without corresponding .pushsection
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:21:18: note: expanded from macro 'runtime_const_ptr'
      21 |                 ".long 1b - .\n\t" \
         |                                ^
   <inline asm>:5:13: note: instantiated into assembly here
       5 |         .popsection
         |                    ^
   fs/dcache.c:112:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'arch', 'relax' or 'norelax' [-Winline-asm]
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:14:13: note: expanded from macro 'runtime_const_ptr'
      14 |         asm_inline("1:\t" \
         |                    ^
   <inline asm>:1:25: note: instantiated into assembly here
       1 | 1:      .option push.option norvclui s2,0x89abd
         |                              ^
   fs/dcache.c:112:9: error: invalid operand for instruction
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:17:26: note: expanded from macro 'runtime_const_ptr'
      17 |                 "lui %[__ret],0x89abd\n\t" \
         |                                        ^
   <inline asm>:2:10: note: instantiated into assembly here
       2 |         addi s2,-0x211
         |                 ^
   fs/dcache.c:112:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'arch', 'relax' or 'norelax' [-Winline-asm]
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:18:26: note: expanded from macro 'runtime_const_ptr'
      18 |                 "addi %[__ret],-0x211\n\t" \
         |                                        ^
   <inline asm>:3:26: note: instantiated into assembly here
       3 |         .option pop.pushsection runtime_ptr_dentry_hashtable,"a"
         |                                 ^
   fs/dcache.c:112:9: error: .popsection without corresponding .pushsection
     112 |         return runtime_const_ptr(dentry_hashtable) +
         |                ^
   arch/riscv/include/asm/runtime-const.h:21:18: note: expanded from macro 'runtime_const_ptr'
      21 |                 ".long 1b - .\n\t" \
         |                                ^
   <inline asm>:5:13: note: instantiated into assembly here
       5 |         .popsection
         |                    ^
   20 warnings and 12
errors generated.

vim +/__tmp +112 fs/dcache.c

ceb5bdc2d246f6 Nicholas Piggin 2011-01-07  109
e60cc61153e61e Linus Torvalds  2024-06-10  110  static inline struct hlist_bl_head *d_hash(unsigned long hashlen)
ceb5bdc2d246f6 Nicholas Piggin 2011-01-07  111  {
e78298556ee5d8 Linus Torvalds  2024-06-04 @112  	return runtime_const_ptr(dentry_hashtable) +
e78298556ee5d8 Linus Torvalds  2024-06-04  113  		runtime_const_shift_right_32(hashlen, d_hash_shift);
ceb5bdc2d246f6 Nicholas Piggin 2011-01-07  114  }
ceb5bdc2d246f6 Nicholas Piggin 2011-01-07  115
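The assembler input quoted in the log (`1: .option push.option norvclui a0,0x89abd` and `.option pop.pushsection runtime_ptr_dentry_hashtable,"a"`) is what adjacent C string literals produce when the "\n\t" separators are missing, which is the case for the `.option` lines of the CONFIG_32BIT variant. The `addi` in that variant also appears to lack its source register operand, which would explain the "invalid operand" error. A minimal sketch in plain C, using ordinary strings rather than inline asm; BROKEN_TEMPLATE, FIXED_TEMPLATE and both helper functions are hypothetical names used only for this illustration, and the register a0 is substituted by hand:

```c
#include <string.h>

/*
 * Hypothetical stand-ins for the start of the asm template. The compiler
 * joins adjacent string literals with nothing in between, so directives
 * without a trailing "\n\t" end up fused with the next statement.
 */
#define BROKEN_TEMPLATE "1:\t" ".option push" ".option norvc" "lui a0,0x89abd\n\t"
#define FIXED_TEMPLATE  "1:\t" ".option push\n\t" ".option norvc\n\t" "lui a0,0x89abd\n\t"

/* Count the newline-terminated statement lines the assembler would see. */
static int count_statement_lines(const char *tmpl)
{
	int n = 0;

	for (; *tmpl; tmpl++)
		if (*tmpl == '\n')
			n++;
	return n;
}

/* Report whether two .option directives were fused into one token run. */
static int has_fused_directives(const char *tmpl)
{
	return strstr(tmpl, ".option push.option") != NULL;
}
```

With the separators in place, each directive lands on its own line, as it already does in the 64-bit variant of the macro.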
Charlie Jenkins wrote:
> On Wed, Jan 29, 2025 at 04:30:49AM -0500, Emil Renner Berthing wrote:
> > Charlie Jenkins wrote:
> > > +/*
> > > + * The 32-bit immediate is stored in a lui+addi pairing.
> > > + * lui holds the upper 20 bits of the immediate in the first 20 bits of the instruction.
> > > + * addi holds the lower 12 bits of the immediate in the first 12 bits of the instruction.
> > > + */
> > > +static inline void __runtime_fixup_32(u32 *lui, u32 *addi, unsigned int val)
> > > +{
> > > +	unsigned int lower_immediate, upper_immediate;
> > > +	u32 lui_insn = le32_to_cpu(*lui);
> > > +	u32 addi_insn = le32_to_cpu(*addi);
> >
> > Because of the compressed extensions RISC-V instructions are only aligned on
> > 16bit boundaries, so is there another reason you know that these two
> > instructions are 32bit aligned? Otherwise you're adding unaligned accesses
> > here.
>
> Great point, thank you. I will add a ".align 4" to the beginning of
> these instructions to force the alignment.

Unless there is a specific reason to do the alignment in this case, I'd much
rather we just do two 16bit reads like riscv/kernel/module.c already does.

/Emil
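The "two 16bit reads" idea can be sketched as follows. This is an illustration under stated assumptions, not the code from riscv/kernel/module.c: the helper names read_insn_halves() and write_insn_halves() are hypothetical, and the parcels are assumed to be in CPU byte order already (kernel code would typically convert each half with le16_to_cpu()/cpu_to_le16()).

```c
#include <stdint.h>

/*
 * Hypothetical helper: assemble one 32-bit RISC-V instruction from two
 * 16-bit parcels, so the access only requires 2-byte alignment. The low
 * parcel comes first in memory.
 */
static inline uint32_t read_insn_halves(const uint16_t *parcel)
{
	return (uint32_t)parcel[0] | ((uint32_t)parcel[1] << 16);
}

/* Hypothetical counterpart: split a 32-bit instruction back into parcels. */
static inline void write_insn_halves(uint16_t *parcel, uint32_t insn)
{
	parcel[0] = (uint16_t)(insn & 0xffffu);
	parcel[1] = (uint16_t)(insn >> 16);
}
```

Reading through a 16-bit pointer keeps every access naturally aligned even when the instruction starts on a 2-byte boundary, which is the situation the review comment describes for code built with the compressed extension.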
On Fri, Jan 31, 2025 at 06:46:23AM -0800, Emil Renner Berthing wrote:
> Charlie Jenkins wrote:
> > On Wed, Jan 29, 2025 at 04:30:49AM -0500, Emil Renner Berthing wrote:
> > > Charlie Jenkins wrote:
> > > > +static inline void __runtime_fixup_32(u32 *lui, u32 *addi, unsigned int val)
> > > > +{
> > > > +	unsigned int lower_immediate, upper_immediate;
> > > > +	u32 lui_insn = le32_to_cpu(*lui);
> > > > +	u32 addi_insn = le32_to_cpu(*addi);
> > >
> > > Because of the compressed extensions RISC-V instructions are only aligned on
> > > 16bit boundaries, so is there another reason you know that these two
> > > instructions are 32bit aligned? Otherwise you're adding unaligned accesses
> > > here.
> >
> > Great point, thank you. I will add a ".align 4" to the beginning of
> > these instructions to force the alignment.
>
> Unless there is a specific reason to do the alignment in this case, I'd much
> rather we just do two 16bit reads like riscv/kernel/module.c already does.

I have no strong preference here. That does seem like a better solution
though.

- Charlie

>
> /Emil
diff --git a/arch/riscv/include/asm/runtime-const.h b/arch/riscv/include/asm/runtime-const.h
new file mode 100644
index 0000000000000000000000000000000000000000..0ecbe6967013900781b0b1048d4622f676b64076
--- /dev/null
+++ b/arch/riscv/include/asm/runtime-const.h
@@ -0,0 +1,194 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_RUNTIME_CONST_H
+#define _ASM_RISCV_RUNTIME_CONST_H
+
+#include <asm/alternative.h>
+#include <asm/cacheflush.h>
+#include <asm/text-patching.h>
+#include <linux/uaccess.h>
+
+#ifdef CONFIG_32BIT
+#define runtime_const_ptr(sym)	\
+({	\
+	typeof(sym) __ret, __tmp;	\
+	asm_inline("1:\t"	\
+		".option push"	\
+		".option norvc"	\
+		"lui %[__ret],0x89abd\n\t"	\
+		"addi %[__ret],-0x211\n\t"	\
+		".option pop"	\
+		".pushsection runtime_ptr_" #sym ",\"a\"\n\t"	\
+		".long 1b - .\n\t"	\
+		".popsection"	\
+		: [__ret] "=r" (__ret));	\
+	__ret;	\
+})
+#else
+/*
+ * Loading 64-bit constants into a register from immediates is a non-trivial
+ * task on riscv64. To get it somewhat performant, load 32 bits into two
+ * different registers and then combine the results.
+ *
+ * If the processor supports the Zbkb extension, we can combine the final
+ * "slli,slli,srli,add" into the single "pack" instruction. If the processor
+ * doesn't support Zbkb but does support the Zbb extension, we can
+ * combine the final "slli,srli,add" into one instruction "add.uw".
+ */
+#define runtime_const_ptr(sym)	\
+({	\
+	typeof(sym) __ret, __tmp;	\
+	asm_inline("1:\t"	\
+		".option push\n\t"	\
+		".option norvc\n\t"	\
+		"lui %[__ret],0x89abd\n\t"	\
+		"lui %[__tmp],0x1234\n\t"	\
+		"addiw %[__ret],%[__ret],-0x211\n\t"	\
+		"addiw %[__tmp],%[__tmp],0x567\n\t"	\
+		ALTERNATIVE_2(	\
+			"slli %[__tmp],%[__tmp],32\n\t"	\
+			"slli %[__ret],%[__ret],32\n\t"	\
+			"srli %[__ret],%[__ret],32\n\t"	\
+			"add %[__ret],%[__ret],%[__tmp]\n\t",	\
+			".option push\n\t"	\
+			".option arch,+zba\n\t"	\
+			"slli %[__tmp],%[__tmp],32\n\t"	\
+			"add.uw %[__ret],%[__ret],%[__tmp]\n\t"	\
+			"nop\n\t"	\
+			"nop\n\t"	\
+			".option pop\n\t",	\
+			0, RISCV_ISA_EXT_ZBA, 1,	\
+			".option push\n\t"	\
+			".option arch,+zbkb\n\t"	\
+			"pack %[__ret],%[__ret],%[__tmp]\n\t"	\
+			"nop\n\t"	\
+			"nop\n\t"	\
+			"nop\n\t"	\
+			".option pop\n\t",	\
+			0, RISCV_ISA_EXT_ZBKB, 1	\
+		)	\
+		".option pop\n\t"	\
+		".pushsection runtime_ptr_" #sym ",\"a\"\n\t"	\
+		".long 1b - .\n\t"	\
+		".popsection"	\
+		: [__ret] "=r" (__ret), [__tmp] "=r" (__tmp));	\
+	__ret;	\
+})
+#endif
+
+#ifdef CONFIG_32BIT
+#define SRLI "srli "
+#else
+#define SRLI "srliw "
+#endif
+
+#define runtime_const_shift_right_32(val, sym)	\
+({	\
+	u32 __ret;	\
+	asm_inline("1:\t"	\
+		".option push\n\t"	\
+		".option norvc\n\t"	\
+		SRLI "%[__ret],%[__val],12\n\t"	\
+		".option pop\n\t"	\
+		".pushsection runtime_shift_" #sym ",\"a\"\n\t"	\
+		".long 1b - .\n\t"	\
+		".popsection"	\
+		: [__ret] "=r" (__ret)	\
+		: [__val] "r" (val));	\
+	__ret;	\
+})
+
+#define runtime_const_init(type, sym) do {	\
+	extern s32 __start_runtime_##type##_##sym[];	\
+	extern s32 __stop_runtime_##type##_##sym[];	\
+	\
+	runtime_const_fixup(__runtime_fixup_##type,	\
+			    (unsigned long)(sym),	\
+			    __start_runtime_##type##_##sym,	\
+			    __stop_runtime_##type##_##sym);	\
+} while (0)
+
+static inline void __runtime_fixup_caches(void *where, unsigned int insns)
+{
+	/* On riscv there are currently only cache-wide flushes so va is ignored. */
+	__always_unused uintptr_t va = (uintptr_t)where;
+
+	flush_icache_range(va, va + 4*insns);
+}
+
+/*
+ * The 32-bit immediate is stored in a lui+addi pairing.
+ * lui holds the upper 20 bits of the immediate in the first 20 bits of the instruction.
+ * addi holds the lower 12 bits of the immediate in the first 12 bits of the instruction.
+ */
+static inline void __runtime_fixup_32(u32 *lui, u32 *addi, unsigned int val)
+{
+	unsigned int lower_immediate, upper_immediate;
+	u32 lui_insn = le32_to_cpu(*lui);
+	u32 addi_insn = le32_to_cpu(*addi);
+	__le32 addi_res, lui_res;
+
+	lower_immediate = sign_extend32(val, 11);
+	upper_immediate = (val - lower_immediate);
+
+	if (upper_immediate & 0xfffff000) {
+		/* replace upper 20 bits of lui with upper immediate */
+		lui_insn &= 0x00000fff;
+		lui_insn |= upper_immediate & 0xfffff000;
+	} else {
+		/* replace lui with nop if immediate is small enough to fit in addi */
+		lui_insn = 0x00000013;
+	}
+
+	if (lower_immediate & 0x00000fff) {
+		/* replace upper 12 bits of addi with lower 12 bits of val */
+		addi_insn &= 0x000fffff;
+		addi_insn |= (lower_immediate & 0x00000fff) << 20;
+	} else {
+		/* replace addi with nop if lower_immediate is empty */
+		addi_insn = 0x00000013;
+	}
+
+	addi_res = cpu_to_le32(addi_insn);
+	lui_res = cpu_to_le32(lui_insn);
+	patch_insn_write(addi, &addi_res, sizeof(addi_res));
+	patch_insn_write(lui, &lui_res, sizeof(lui_res));
+}
+
+static inline void __runtime_fixup_ptr(void *where, unsigned long val)
+{
+	if (IS_ENABLED(CONFIG_32BIT)) {
+		__runtime_fixup_32(where, where + 4, val);
+		__runtime_fixup_caches(where, 2);
+	} else {
+		__runtime_fixup_32(where, where + 8, val);
+		__runtime_fixup_32(where + 4, where + 12, val >> 32);
+		__runtime_fixup_caches(where, 4);
+	}
+}
+
+/*
+ * Replace the least significant 5 bits of the srli/srliw immediate that is
+ * located at bits 20-24
+ */
+static inline void __runtime_fixup_shift(void *where, unsigned long val)
+{
+	u32 insn = le32_to_cpu(*(__le32 *)where);
+	__le32 res;
+
+	insn &= 0xfe0fffff;
+	insn |= (val & 0b11111) << 20;
+
+	res = cpu_to_le32(insn);
+	patch_text_nosync(where, &res, sizeof(insn));
+}
+
+static inline void runtime_const_fixup(void (*fn)(void *, unsigned long),
+				       unsigned long val, s32 *start, s32 *end)
+{
+	while (start < end) {
+		fn(*start + (void *)start, val);
+		start++;
+	}
+}
+
+#endif /* _ASM_RISCV_RUNTIME_CONST_H */
diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
index 002ca58dd998cb78b662837b5ebac988fb6c77bb..61bd5ba6680a786bf1db7dc37bf1acda0639b5c7 100644
--- a/arch/riscv/kernel/vmlinux.lds.S
+++ b/arch/riscv/kernel/vmlinux.lds.S
@@ -97,6 +97,9 @@ SECTIONS
 	{
 		EXIT_DATA
 	}
+
+	RUNTIME_CONST_VARIABLES
+
 	PERCPU_SECTION(L1_CACHE_BYTES)
 
 	.rel.dyn : {
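The field arithmetic behind __runtime_fixup_32() and __runtime_fixup_shift() can be checked in isolation. The sketch below is plain user-space C, not kernel code: every function and type name in it is hypothetical, it leaves out the nop substitution and the actual text patching, and it only mirrors the masks and shifts visible in the patch (lui immediate in bits 31:12, addi immediate in bits 31:20, shift amount in bits 24:20).

```c
#include <stdint.h>

struct imm_split {
	uint32_t upper;	/* multiple of 0x1000, goes into lui */
	uint32_t lower;	/* sign-extended low 12 bits, goes into addi */
};

/* Split val so that upper + lower == val (mod 2^32). */
static inline struct imm_split split_imm32(uint32_t val)
{
	struct imm_split s;

	/* Same result as sign_extend32(val, 11), using unsigned wraparound. */
	s.lower = ((val & 0xfffu) ^ 0x800u) - 0x800u;
	s.upper = val - s.lower;
	return s;
}

/* Keep rd and opcode (bits 11:0), replace the 20-bit immediate. */
static inline uint32_t patch_lui(uint32_t lui_insn, uint32_t upper)
{
	return (lui_insn & 0x00000fffu) | (upper & 0xfffff000u);
}

/* Keep bits 19:0, replace the 12-bit immediate held in bits 31:20. */
static inline uint32_t patch_addi(uint32_t addi_insn, uint32_t lower)
{
	return (addi_insn & 0x000fffffu) | ((lower & 0xfffu) << 20);
}

/* Replace the 5-bit shift amount of srli/srliw held in bits 24:20. */
static inline uint32_t patch_shamt(uint32_t insn, uint32_t shift)
{
	return (insn & 0xfe0fffffu) | ((shift & 0x1fu) << 20);
}

/* Emulate the 32-bit value produced by the patched lui followed by addi. */
static inline uint32_t emulate_lui_addi(uint32_t lui_insn, uint32_t addi_insn)
{
	uint32_t hi = lui_insn & 0xfffff000u;
	uint32_t imm12 = addi_insn >> 20;

	return hi + (((imm12 ^ 0x800u) - 0x800u));
}
```

The sign extension is the reason the upper part is computed as `val - lower` rather than by masking: when bit 11 of the value is set, addi subtracts, so lui has to hold one more 0x1000 step. The placeholder in the macro follows the same rule, since 0x89abd000 - 0x211 gives 0x89abcdef.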
Implement the runtime constant infrastructure for riscv. Use this
infrastructure to generate constants to be used by the d_hash()
function.

This is the riscv variant of commit 94a2bc0f611c ("arm64: add 'runtime
constant' support") and commit e3c92e81711d ("runtime constants: add
x86 architecture support").

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
Ard brought this to my attention in this patch [1].

[1] https://lore.kernel.org/lkml/CAMj1kXE4DJnwFejNWQu784GvyJO=aGNrzuLjSxiowX_e7nW8QA@mail.gmail.com/
---
Changes in v3:
- Leverage "pack" instruction for runtime_const_ptr() to reduce hot path
  by 3 instructions if Zbkb is supported. Suggested by Pasha Bouzarjomehri (pasha@rivosinc.com)
- Link to v2: https://lore.kernel.org/r/20250127-runtime_const_riscv-v2-1-95ae7cf97a39@rivosinc.com

Changes in v2:
- Treat instructions as __le32 and do proper conversions (Ben)
- Link to v1: https://lore.kernel.org/r/20250127-runtime_const_riscv-v1-1-795b023ea20b@rivosinc.com
---
 arch/riscv/include/asm/runtime-const.h | 194 +++++++++++++++++++++++++++++++++
 arch/riscv/kernel/vmlinux.lds.S        |   3 +
 2 files changed, 197 insertions(+)

---
base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04
change-id: 20250123-runtime_const_riscv-6cd854ee2817
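The "3 instructions" figure in the v3 changelog can be read off the three ALTERNATIVE_2 bodies: four instructions in the base sequence, two with add.uw, one with pack. The sketch below models the three variants in portable C to show that they produce the same 64-bit value; it is an emulation under stated assumptions, with hypothetical function names, and it models the registers after lui+addiw as sign-extended 32-bit values. Note that add.uw is provided by the Zba extension, which matches the RISCV_ISA_EXT_ZBA check in the macro even though the comment in the patch says Zbb.

```c
#include <stdint.h>

/* Register contents after lui+addiw on rv64: 32-bit result, sign-extended. */
static inline uint64_t sext32(uint32_t v)
{
	return (v & 0x80000000u) ? (0xffffffff00000000ull | v) : (uint64_t)v;
}

/* Base: slli tmp,tmp,32; slli ret,ret,32; srli ret,ret,32; add ret,ret,tmp */
static inline uint64_t combine_generic(uint64_t ret, uint64_t tmp)
{
	tmp <<= 32;
	ret <<= 32;
	ret >>= 32;
	return ret + tmp;
}

/* Zba: slli tmp,tmp,32; add.uw ret,ret,tmp (adds the zero-extended low word) */
static inline uint64_t combine_add_uw(uint64_t ret, uint64_t tmp)
{
	tmp <<= 32;
	return tmp + (ret & 0xffffffffull);
}

/* Zbkb: pack ret,ret,tmp (low word of ret below the low word of tmp) */
static inline uint64_t combine_pack(uint64_t ret, uint64_t tmp)
{
	return (ret & 0xffffffffull) | ((tmp & 0xffffffffull) << 32);
}
```

With the placeholder halves from the macro, 0x89abcdef for the low word and 0x01234567 for the high word, all three variants yield 0x0123456789abcdef.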