Message ID | 20230113212301.3534711-1-heiko@sntech.de (mailing list archive) |
---|---|
Series | Zbb string optimizations |
On Fri, 13 Jan 2023 22:22:59 +0100, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
>
> This series still tries to allow optimized string functions for specific
> extensions. The last approach of using an inline base function to hold
> the alternative calls did cause some issues in a number of places
>
> So instead of that we're now just using an alternative j at the beginning
> of the generic function to jump to a separate place inside the function
> itself.
>
> [...]

Applied, thanks!

[1/2] RISC-V: add infrastructure to allow different str* implementations
      https://git.kernel.org/palmer/c/56e0790c7f9e
[2/2] RISC-V: add zbb support to string functions
      https://git.kernel.org/palmer/c/b6fcdb191e36

Best regards,

Hello:

This series was applied to riscv/linux.git (for-next)
by Palmer Dabbelt <palmer@rivosinc.com>:

On Fri, 13 Jan 2023 22:22:59 +0100 you wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
>
> This series still tries to allow optimized string functions for specific
> extensions. The last approach of using an inline base function to hold
> the alternative calls did cause some issues in a number of places
>
> So instead of that we're now just using an alternative j at the beginning
> of the generic function to jump to a separate place inside the function
> itself.
>
> [...]

Here is the summary with links:
  - [v5,1/2] RISC-V: add infrastructure to allow different str* implementations
    https://git.kernel.org/riscv/c/56e0790c7f9e
  - [v5,2/2] RISC-V: add zbb support to string functions
    https://git.kernel.org/riscv/c/b6fcdb191e36

You are awesome, thank you!

On Thu, Feb 02, 2023 at 03:39:09PM -0800, Palmer Dabbelt wrote:
> On Fri, 13 Jan 2023 22:22:59 +0100, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> >
> > This series still tries to allow optimized string functions for specific
> > extensions. The last approach of using an inline base function to hold
> > the alternative calls did cause some issues in a number of places
> >
> > So instead of that we're now just using an alternative j at the beginning
> > of the generic function to jump to a separate place inside the function
> > itself.
> >
> > [...]
>
> Applied, thanks!
>
> [1/2] RISC-V: add infrastructure to allow different str* implementations
>       https://git.kernel.org/palmer/c/56e0790c7f9e
> [2/2] RISC-V: add zbb support to string functions
>       https://git.kernel.org/palmer/c/b6fcdb191e36

I had a few more comments on this version which I was hoping Heiko would
respin for. Some were just nits, but the extra labels (which don't hurt,
but are confusing) and the s/words/bytes/ comment change (which also
doesn't affect functionality, but causes confusion) would be good to
pick up. With those changes, I would have given it an r-b.

Thanks,
drew

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

This series still tries to allow optimized string functions for specific
extensions. The last approach of using an inline base function to hold
the alternative calls did cause some issues in a number of places

So instead of that we're now just using an alternative j at the beginning
of the generic function to jump to a separate place inside the function
itself.

This of course needs a fixup for "j" instructions in alternative blocks,
so that is provided here as well.

Technically patch4 got a review from Andrew, but that was still with
the inline approach, so I didn't bring it over to v4.

Dependencies:
- 6.2-rc1
- the series adding the call address fixes to alternatives
  (in Palmer's next)
  https://lore.kernel.org/r/20221223221332.4127602-1-heiko@sntech.de
- the patch adding the j address fix to alternatives
  https://lore.kernel.org/r/20230113212205.3534622-1-heiko@sntech.de
- Conor's isa extension reordering series
  https://lore.kernel.org/r/20221205144525.2148448-1-conor.dooley@microchip.com

changes since v4:
- split out the now shared alternatives j-address-fixup
- split out the somewhat unrelated isn-func move patch
- follow Andrew's great suggestions on making the assembly nicer
- put the Zbb extension into the correct place after Conor's reordering

changes since v3:
- rebase on top of 6.2-rc1 + the applied alternative-call series
- add alternative fixup for jal instructions
- drop the inline functions and instead just jump

changes since v2:
- add patch fixing the c.jalr funct4 value
- reword some commit messages
- fix position of auipc addition patch (earlier)
- fix compile errors from patch-reordering gone wrong
  (worked at the end of v2, but compiling individual patches
  caused issues)
- patches are now tested individually
- limit Zbb variants for GNU as for now
  (LLVM support for .option arch is still under review)
- prevent str-functions from getting optimized to builtin-variants

changes since v1:
- a number of generalizations/cleanups for instruction parsing
- use accessor function to access instructions (Emil)
- actually patch the correct location when having more than one
  instruction in an alternative block
- string function cleanups (comments etc) (Conor)
- move zbb extension above s* extensions in cpu.c lists

changes since rfc:
- make Zbb code actually work
- drop some unneeded patches
- a lot of cleanups

Heiko Stuebner (2):
  RISC-V: add infrastructure to allow different str* implementations
  RISC-V: add zbb support to string functions

 arch/riscv/Kconfig                   |  24 +++++
 arch/riscv/include/asm/errata_list.h |   3 +-
 arch/riscv/include/asm/hwcap.h       |   1 +
 arch/riscv/include/asm/string.h      |  10 ++
 arch/riscv/kernel/cpu.c              |   1 +
 arch/riscv/kernel/cpufeature.c       |  18 ++++
 arch/riscv/kernel/riscv_ksyms.c      |   3 +
 arch/riscv/lib/Makefile              |   3 +
 arch/riscv/lib/strcmp.S              | 121 +++++++++++++++++++++++
 arch/riscv/lib/strlen.S              | 133 +++++++++++++++++++++++++
 arch/riscv/lib/strncmp.S             | 139 +++++++++++++++++++++++++++
 arch/riscv/purgatory/Makefile        |  13 +++
 12 files changed, 468 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/lib/strcmp.S
 create mode 100644 arch/riscv/lib/strlen.S
 create mode 100644 arch/riscv/lib/strncmp.S
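
For readers who want a feel for the "jump at the beginning of the generic
function" layout described in the cover letter, here is a rough C model. It
is only a sketch, not the code from the patches: the names cpu_has_zbb,
orc_b() and sketch_strlen() are made up for illustration, the runtime "if"
stands in for a jump that the alternatives mechanism would patch in rather
than evaluate on every call, and orc_b() is a software model of the Zbb
orc.b instruction. The word-wise path assumes the buffer is padded to a
multiple of 8 bytes so that reading a whole aligned word stays in bounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag standing in for boot-time Zbb detection. */
static bool cpu_has_zbb = true;

/*
 * Software model of Zbb orc.b: every nonzero byte of x becomes 0xff,
 * every zero byte stays 0x00.
 */
static uint64_t orc_b(uint64_t x)
{
	uint64_t out = 0;

	for (int i = 0; i < 8; i++)
		if ((x >> (8 * i)) & 0xff)
			out |= (uint64_t)0xff << (8 * i);
	return out;
}

/* Illustrative strlen with both variants living in one function. */
size_t sketch_strlen(const char *s)
{
	const char *p = s;
	uint64_t word, nul_mask;

	assert(s != NULL);

	/* Models the jump placed at the start of the generic function. */
	if (cpu_has_zbb)
		goto strlen_zbb;

	/* Generic path: one byte at a time. */
	while (*p)
		p++;
	return (size_t)(p - s);

strlen_zbb:
	/* Walk single bytes until p is 8-byte aligned. */
	while ((uintptr_t)p % sizeof(word)) {
		if (!*p)
			return (size_t)(p - s);
		p++;
	}

	/* Then check eight bytes per iteration. */
	for (;;) {
		memcpy(&word, p, sizeof(word));
		nul_mask = ~orc_b(word);	/* 0xff wherever a NUL byte sits */
		if (nul_mask) {
			/*
			 * A byte scan keeps this sketch endian-independent;
			 * assembly would typically use ctz or clz on the mask.
			 */
			for (size_t i = 0; i < sizeof(word); i++)
				if (!p[i])
					return (size_t)(p - s) + i;
		}
		p += sizeof(word);
	}
}
```

Keeping both variants behind a single entry point means callers and symbol
exports do not change; only the first instruction of the function differs
between CPUs with and without the extension.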