Message ID | 20230113212351.3534769-1-heiko@sntech.de (mailing list archive) |
---|---|
Series | Zbb + fast-unaligned string optimization |

On Fri, 13 Jan 2023 13:23:47 PST (-0800), heiko@sntech.de wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
>
> This is a follow-up to my Zbb-based string optimization series, that
> then adds another strcmp variant for systems with Zbb that also can
> do unaligned accesses fast in hardware.
>
> For this it uses Palmer's series for hw-feature probing that would read
> this property from firmware (devicetree), as the performance of unaligned
> accesses is an implementation detail of the relevant cpu core.
>
>
> Right now we're still in the middle of discussing how more complex
> cpufeature-combinations should be handled in general, so this is more
> of a concept on one possible way to do it.

Sorry for leaving this dormant for a bit. There's been a lot of
discussion, and I think the general consensus is to aim at taking these
combined workloads only if they are a performance win on real hardware.
I think there's no Zbb+fast-unaligned hardware available today, but I'm
not 100% sure on that. If there is and someone can show benchmarks then
I'm happy to fit something like this in somehow, but otherwise I think
we should wait and see if this matches what ships.

> Dependencies:
> - my Zbb string series
>   https://lore.kernel.org/r/20230113212301.3534711-1-heiko@sntech.de
> - Palmer's hw-probing series
>   https://lore.kernel.org/r/20221013163551.6775-1-palmer@rivosinc.com
>
>
> Heiko Stuebner (4):
>   RISC-V: use bit-values instead of numbers to identify patched
>     cpu-features
>   RISC-V: add alternative-field for bits to not match against
>   RISC-V: add cpufeature probing for fast-unaligned access
>   RISC-V: add strcmp variant using zbb and fast-unaligned access
>
>  arch/riscv/include/asm/alternative-macros.h |  64 ++++----
>  arch/riscv/include/asm/alternative.h        |   1 +
>  arch/riscv/include/asm/errata_list.h        |  27 ++--
>  arch/riscv/kernel/cpufeature.c              |  33 +++-
>  arch/riscv/lib/strcmp.S                     | 170 +++++++++++++++++++-
>  arch/riscv/lib/strlen.S                     |   2 +-
>  arch/riscv/lib/strncmp.S                    |   2 +-
>  7 files changed, 245 insertions(+), 54 deletions(-)

From: Heiko Stuebner <heiko.stuebner@vrull.eu>

This is a follow-up to my Zbb-based string optimization series, that
then adds another strcmp variant for systems with Zbb that also can
do unaligned accesses fast in hardware.

For this it uses Palmer's series for hw-feature probing that would read
this property from firmware (devicetree), as the performance of unaligned
accesses is an implementation detail of the relevant cpu core.

Right now we're still in the middle of discussing how more complex
cpufeature-combinations should be handled in general, so this is more
of a concept on one possible way to do it.

Dependencies:
- my Zbb string series
  https://lore.kernel.org/r/20230113212301.3534711-1-heiko@sntech.de
- Palmer's hw-probing series
  https://lore.kernel.org/r/20221013163551.6775-1-palmer@rivosinc.com

Heiko Stuebner (4):
  RISC-V: use bit-values instead of numbers to identify patched
    cpu-features
  RISC-V: add alternative-field for bits to not match against
  RISC-V: add cpufeature probing for fast-unaligned access
  RISC-V: add strcmp variant using zbb and fast-unaligned access

 arch/riscv/include/asm/alternative-macros.h |  64 ++++----
 arch/riscv/include/asm/alternative.h        |   1 +
 arch/riscv/include/asm/errata_list.h        |  27 ++--
 arch/riscv/kernel/cpufeature.c              |  33 +++-
 arch/riscv/lib/strcmp.S                     | 170 +++++++++++++++++++-
 arch/riscv/lib/strlen.S                     |   2 +-
 arch/riscv/lib/strncmp.S                    |   2 +-
 7 files changed, 245 insertions(+), 54 deletions(-)
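
For readers who have not looked at the patches themselves, the general idea
behind a word-at-a-time strcmp that relies on Zbb and cheap unaligned loads
can be sketched in portable C. This is only an illustration and not the
assembly from the series: orc_b() below is a software stand-in for the Zbb
orc.b instruction, strcmp_wordwise() is a hypothetical name, and the sketch
assumes both buffers stay readable in 8-byte chunks up to and including the
chunk that holds the terminating NUL.

```c
#include <stdint.h>
#include <string.h>

/*
 * Software stand-in for the Zbb orc.b instruction: every byte of the
 * result is 0xff if the matching input byte had any bit set, else 0x00.
 */
static uint64_t orc_b(uint64_t x)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++)
		if ((x >> (8 * i)) & 0xff)
			r |= (uint64_t)0xff << (8 * i);
	return r;
}

/*
 * Hypothetical sketch, not the code from the series.
 * Assumption: both buffers are readable in whole 8-byte chunks through
 * the chunk containing the NUL terminator.
 */
static int strcmp_wordwise(const char *a, const char *b)
{
	for (;; a += 8, b += 8) {
		uint64_t wa, wb;

		/* memcpy models an unaligned 64-bit load from each string. */
		memcpy(&wa, a, 8);
		memcpy(&wb, b, 8);

		/* Words equal and no zero byte: keep going a word at a time. */
		if (wa == wb && orc_b(wa) == UINT64_MAX)
			continue;

		/* Either a difference or a NUL is in this word: finish bytewise. */
		for (int i = 0; i < 8; i++) {
			unsigned char ca = (unsigned char)a[i];
			unsigned char cb = (unsigned char)b[i];

			if (ca != cb)
				return ca < cb ? -1 : 1;
			if (ca == 0)
				return 0;
		}
	}
}
```

The point of the sketch is that neither pointer needs to be aligned before
the word loop starts, which only pays off if the hardware handles unaligned
loads quickly. That is why the series wants to probe that property rather
than assume it.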