Message ID: 20210617152754.17960-1-mcroce@linux.microsoft.com (mailing list archive)
Series: riscv: optimized mem* functions
Hello Matteo,

On 2021-06-17 18:27, Matteo Croce wrote:
> From: Matteo Croce <mcroce@microsoft.com>
>
> Replace the assembly mem{cpy,move,set} with C equivalent.
>
> Try to access RAM with the largest bit width possible, but without
> doing unaligned accesses.
>
> Tested on a BeagleV Starlight with a SiFive U74 core, where the
> improvement is noticeable.
>

There are already generic C implementations for memcpy/memmove/memset at
https://elixir.bootlin.com/linux/v5.13-rc7/source/lib/string.c#L871, but
they copy one byte at a time. I suggest you update them to do a
word-by-word copy instead of introducing yet another memcpy/memmove C
implementation under arch/riscv/.
On Tue, Jun 22, 2021 at 9:09 AM Nick Kossifidis <mick@ics.forth.gr> wrote:
>
> Hello Matteo,
>
> On 2021-06-17 18:27, Matteo Croce wrote:
> > From: Matteo Croce <mcroce@microsoft.com>
> >
> > Replace the assembly mem{cpy,move,set} with C equivalent.
> >
> > Try to access RAM with the largest bit width possible, but without
> > doing unaligned accesses.
> >
> > Tested on a BeagleV Starlight with a SiFive U74 core, where the
> > improvement is noticeable.
> >
>
> There are already generic C implementations for memcpy/memmove/memset at
> https://elixir.bootlin.com/linux/v5.13-rc7/source/lib/string.c#L871 but
> are doing one byte at a time, I suggest you update them to do
> word-by-word copy instead of introducing yet another memcpy/memmove C
> implementation on arch/riscv/.

Yes, I've tried to copy the Glibc version into arch/csky/abiv1, and
Arnd suggested putting them into generic.

ref: https://lore.kernel.org/linux-arch/20190629053641.3iBfk9-I_D29cDp9yJnIdIg7oMtHNZlDmhLQPTumhEc@z/#t

--
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/
From: Matteo Croce <mcroce@microsoft.com>

Replace the assembly mem{cpy,move,set} with C equivalents.

Try to access RAM with the largest bit width possible, but without
doing unaligned accesses.

Tested on a BeagleV Starlight with a SiFive U74 core, where the
improvement is noticeable.

v2 -> v3:
- alias mem* to __mem* and not vice versa
- use __alias instead of a tail call

v1 -> v2:
- reduce the threshold from 64 to 16 bytes
- fix KASAN build
- optimize memset

Matteo Croce (3):
  riscv: optimized memcpy
  riscv: optimized memmove
  riscv: optimized memset

 arch/riscv/include/asm/string.h |  18 ++--
 arch/riscv/kernel/Makefile      |   1 -
 arch/riscv/kernel/riscv_ksyms.c |  17 ----
 arch/riscv/lib/Makefile         |   4 +-
 arch/riscv/lib/memcpy.S         | 108 ----------------------
 arch/riscv/lib/memmove.S        |  64 -------------
 arch/riscv/lib/memset.S         | 113 -----------------------
 arch/riscv/lib/string.c         | 153 ++++++++++++++++++++++++++++++++
 8 files changed, 163 insertions(+), 315 deletions(-)
 delete mode 100644 arch/riscv/kernel/riscv_ksyms.c
 delete mode 100644 arch/riscv/lib/memcpy.S
 delete mode 100644 arch/riscv/lib/memmove.S
 delete mode 100644 arch/riscv/lib/memset.S
 create mode 100644 arch/riscv/lib/string.c