diff mbox series

riscv, lib: Fix Zbb strncmp

Message ID 20230228184211.1585641-1-bjorn@kernel.org (mailing list archive)
State Accepted
Commit 81a1dd10b072fd432f37cd5dc9eb3ed6ec5d386e
Headers show
Series riscv, lib: Fix Zbb strncmp | expand

Checks

Context Check Description
conchuod/cover_letter success Single patches do not need cover letters
conchuod/tree_selection success Guessed tree name to be for-next
conchuod/fixes_present success Fixes tag not required for -next series
conchuod/maintainers_pattern success MAINTAINERS pattern errors before the patch: 1 and now 1
conchuod/verify_signedoff success Signed-off-by tag matches author and committer
conchuod/kdoc success Errors and warnings before: 0 this patch: 0
conchuod/build_rv64_clang_allmodconfig success Errors and warnings before: 0 this patch: 0
conchuod/module_param success Was 0 now: 0
conchuod/build_rv64_gcc_allmodconfig success Errors and warnings before: 0 this patch: 0
conchuod/alphanumeric_selects success Out of order selects before the patch: 728 and now 728
conchuod/build_rv32_defconfig success Build OK
conchuod/dtb_warn_rv64 success Errors and warnings before: 11 this patch: 11
conchuod/header_inline success No static functions without inline keyword in header files
conchuod/checkpatch warning WARNING: Reported-by: should be immediately followed by Link: with a URL to the report
conchuod/source_inline success Was 0 now: 0
conchuod/build_rv64_nommu_k210_defconfig success Build OK
conchuod/verify_fixes success Fixes tag looks correct
conchuod/build_rv64_nommu_virt_defconfig success Build OK

Commit Message

Björn Töpel Feb. 28, 2023, 6:42 p.m. UTC
From: Björn Töpel <bjorn@rivosinc.com>

The Zbb optimized strncmp has two parts; a fast path that does XLEN/8B
per iteration, and a slow that does one byte per iteration.

The idea is to compare aligned XLEN chunks for most of strings, and do
the remainder tail in the slow path.

The Zbb strncmp has two issues in the fast path:

Incorrect remainder handling (wrong compare): Assume that the string
length is 9. On 64b systems, the fast path should do one iteration,
and one iteration in the slow path. Instead, both were done in the
fast path, which lead to incorrect results. An example:

  strncmp("/dev/vda", "/dev/", 5);

Correct by changing "bgt" to "bge".

Missing NULL checks in the second string: This could lead to incorrect
results for:

  strncmp("/dev/vda", "/dev/vda\0", 8);

Correct by adding an additional check.

Fixes: b6fcdb191e36 ("RISC-V: add zbb support to string functions")
Suggested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
---
 arch/riscv/lib/strncmp.S | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


base-commit: eb9be8310c58c166f9fae3b71c0ad9d6741b4897

Comments

Conor Dooley Feb. 28, 2023, 7:12 p.m. UTC | #1
On Tue, Feb 28, 2023 at 07:42:10PM +0100, Björn Töpel wrote:
> From: Björn Töpel <bjorn@rivosinc.com>
> 
> The Zbb optimized strncmp has two parts; a fast path that does XLEN/8B
> per iteration, and a slow that does one byte per iteration.
> 
> The idea is to compare aligned XLEN chunks for most of strings, and do
> the remainder tail in the slow path.
> 
> The Zbb strncmp has two issues in the fast path:
> 
> Incorrect remainder handling (wrong compare): Assume that the string
> length is 9. On 64b systems, the fast path should do one iteration,
> and one iteration in the slow path. Instead, both were done in the
> fast path, which lead to incorrect results. An example:
> 
>   strncmp("/dev/vda", "/dev/", 5);
> 
> Correct by changing "bgt" to "bge".
> 
> Missing NULL checks in the second string: This could lead to incorrect
> results for:
> 
>   strncmp("/dev/vda", "/dev/vda\0", 8);
> 
> Correct by adding an additional check.
> 
> Fixes: b6fcdb191e36 ("RISC-V: add zbb support to string functions")
> Suggested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>

With your new selftest stuff & on a JH7110:
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Also, given the reporter, this should be:
Reported-by: Guenter Roeck <linux@roeck-us.net>
no?

Thanks,
Conor.
Björn Töpel Feb. 28, 2023, 7:41 p.m. UTC | #2
Conor Dooley <conor@kernel.org> writes:

> On Tue, Feb 28, 2023 at 07:42:10PM +0100, Björn Töpel wrote:
>> From: Björn Töpel <bjorn@rivosinc.com>
>> 
>> The Zbb optimized strncmp has two parts; a fast path that does XLEN/8B
>> per iteration, and a slow that does one byte per iteration.
>> 
>> The idea is to compare aligned XLEN chunks for most of strings, and do
>> the remainder tail in the slow path.
>> 
>> The Zbb strncmp has two issues in the fast path:
>> 
>> Incorrect remainder handling (wrong compare): Assume that the string
>> length is 9. On 64b systems, the fast path should do one iteration,
>> and one iteration in the slow path. Instead, both were done in the
>> fast path, which lead to incorrect results. An example:
>> 
>>   strncmp("/dev/vda", "/dev/", 5);
>> 
>> Correct by changing "bgt" to "bge".
>> 
>> Missing NULL checks in the second string: This could lead to incorrect
>> results for:
>> 
>>   strncmp("/dev/vda", "/dev/vda\0", 8);
>> 
>> Correct by adding an additional check.
>> 
>> Fixes: b6fcdb191e36 ("RISC-V: add zbb support to string functions")
>> Suggested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
>> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
>
> With your new selftest stuff & on a JH7110:
> Tested-by: Conor Dooley <conor.dooley@microchip.com>
> Also, given the reporter, this should be:
> Reported-by: Guenter Roeck <linux@roeck-us.net>
> no?

Indeed. Sorry, Guenter, for missing that!
Guenter Roeck Feb. 28, 2023, 7:53 p.m. UTC | #3
On 2/28/23 11:12, Conor Dooley wrote:
> On Tue, Feb 28, 2023 at 07:42:10PM +0100, Björn Töpel wrote:
>> From: Björn Töpel <bjorn@rivosinc.com>
>>
>> The Zbb optimized strncmp has two parts; a fast path that does XLEN/8B
>> per iteration, and a slow that does one byte per iteration.
>>
>> The idea is to compare aligned XLEN chunks for most of strings, and do
>> the remainder tail in the slow path.
>>
>> The Zbb strncmp has two issues in the fast path:
>>
>> Incorrect remainder handling (wrong compare): Assume that the string
>> length is 9. On 64b systems, the fast path should do one iteration,
>> and one iteration in the slow path. Instead, both were done in the
>> fast path, which lead to incorrect results. An example:
>>
>>    strncmp("/dev/vda", "/dev/", 5);
>>
>> Correct by changing "bgt" to "bge".
>>
>> Missing NULL checks in the second string: This could lead to incorrect
>> results for:
>>
>>    strncmp("/dev/vda", "/dev/vda\0", 8);
>>
>> Correct by adding an additional check.
>>
>> Fixes: b6fcdb191e36 ("RISC-V: add zbb support to string functions")
>> Suggested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
>> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
> 
> With your new selftest stuff & on a JH7110:
> Tested-by: Conor Dooley <conor.dooley@microchip.com>
> Also, given the reporter, this should be:
> Reported-by: Guenter Roeck <linux@roeck-us.net>
> no?
> 

No worries. I'll give the patch a try.

Guenter
Guenter Roeck Feb. 28, 2023, 9:18 p.m. UTC | #4
On 2/28/23 10:42, Björn Töpel wrote:
> From: Björn Töpel <bjorn@rivosinc.com>
> 
> The Zbb optimized strncmp has two parts; a fast path that does XLEN/8B
> per iteration, and a slow that does one byte per iteration.
> 
> The idea is to compare aligned XLEN chunks for most of strings, and do
> the remainder tail in the slow path.
> 
> The Zbb strncmp has two issues in the fast path:
> 
> Incorrect remainder handling (wrong compare): Assume that the string
> length is 9. On 64b systems, the fast path should do one iteration,
> and one iteration in the slow path. Instead, both were done in the
> fast path, which lead to incorrect results. An example:
> 
>    strncmp("/dev/vda", "/dev/", 5);
> 
> Correct by changing "bgt" to "bge".
> 
> Missing NULL checks in the second string: This could lead to incorrect
> results for:
> 
>    strncmp("/dev/vda", "/dev/vda\0", 8);
> 
> Correct by adding an additional check.
> 
> Fixes: b6fcdb191e36 ("RISC-V: add zbb support to string functions")
> Suggested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>

Tested-by: Guenter Roeck <linux@roeck-us.net>

> ---
>   arch/riscv/lib/strncmp.S | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
> index ee49595075be..efd3f3150a54 100644
> --- a/arch/riscv/lib/strncmp.S
> +++ b/arch/riscv/lib/strncmp.S
> @@ -78,11 +78,13 @@ strncmp_zbb:
>   	/* Main loop for aligned string.  */
>   	.p2align 3
>   1:
> -	bgt	a0, t6, 3f
> +	bge	a0, t6, 3f
>   	REG_L	t0, 0(a0)
>   	REG_L	t1, 0(a1)
>   	orc.b	t3, t0
>   	bne	t3, t5, 2f
> +	orc.b	t3, t1
> +	bne	t3, t5, 2f
>   	addi	a0, a0, SZREG
>   	addi	a1, a1, SZREG
>   	beq	t0, t1, 1b
> 
> base-commit: eb9be8310c58c166f9fae3b71c0ad9d6741b4897
Palmer Dabbelt Feb. 28, 2023, 9:55 p.m. UTC | #5
On Tue, 28 Feb 2023 13:18:12 PST (-0800), linux@roeck-us.net wrote:
> On 2/28/23 10:42, Björn Töpel wrote:
>> From: Björn Töpel <bjorn@rivosinc.com>
>>
>> The Zbb optimized strncmp has two parts; a fast path that does XLEN/8B
>> per iteration, and a slow that does one byte per iteration.
>>
>> The idea is to compare aligned XLEN chunks for most of strings, and do
>> the remainder tail in the slow path.
>>
>> The Zbb strncmp has two issues in the fast path:
>>
>> Incorrect remainder handling (wrong compare): Assume that the string
>> length is 9. On 64b systems, the fast path should do one iteration,
>> and one iteration in the slow path. Instead, both were done in the
>> fast path, which lead to incorrect results. An example:
>>
>>    strncmp("/dev/vda", "/dev/", 5);
>>
>> Correct by changing "bgt" to "bge".
>>
>> Missing NULL checks in the second string: This could lead to incorrect
>> results for:
>>
>>    strncmp("/dev/vda", "/dev/vda\0", 8);
>>
>> Correct by adding an additional check.
>>
>> Fixes: b6fcdb191e36 ("RISC-V: add zbb support to string functions")
>> Suggested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
>> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
>
> Tested-by: Guenter Roeck <linux@roeck-us.net>

Thanks guys, I'm going to apply this ASAP as it's a pretty concrete 
breakage in new code from this merge window.  I just stuck it in my 
staging branch, it should show up on for-next after the next batch of 
tests.

>
>> ---
>>   arch/riscv/lib/strncmp.S | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
>> index ee49595075be..efd3f3150a54 100644
>> --- a/arch/riscv/lib/strncmp.S
>> +++ b/arch/riscv/lib/strncmp.S
>> @@ -78,11 +78,13 @@ strncmp_zbb:
>>   	/* Main loop for aligned string.  */
>>   	.p2align 3
>>   1:
>> -	bgt	a0, t6, 3f
>> +	bge	a0, t6, 3f
>>   	REG_L	t0, 0(a0)
>>   	REG_L	t1, 0(a1)
>>   	orc.b	t3, t0
>>   	bne	t3, t5, 2f
>> +	orc.b	t3, t1
>> +	bne	t3, t5, 2f
>>   	addi	a0, a0, SZREG
>>   	addi	a1, a1, SZREG
>>   	beq	t0, t1, 1b
>>
>> base-commit: eb9be8310c58c166f9fae3b71c0ad9d6741b4897
patchwork-bot+linux-riscv@kernel.org March 1, 2023, 3:20 a.m. UTC | #6
Hello:

This patch was applied to riscv/linux.git (for-next)
by Palmer Dabbelt <palmer@rivosinc.com>:

On Tue, 28 Feb 2023 19:42:10 +0100 you wrote:
> From: Björn Töpel <bjorn@rivosinc.com>
> 
> The Zbb optimized strncmp has two parts; a fast path that does XLEN/8B
> per iteration, and a slow that does one byte per iteration.
> 
> The idea is to compare aligned XLEN chunks for most of strings, and do
> the remainder tail in the slow path.
> 
> [...]

Here is the summary with links:
  - riscv, lib: Fix Zbb strncmp
    https://git.kernel.org/riscv/c/81a1dd10b072

You are awesome, thank you!
diff mbox series

Patch

diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
index ee49595075be..efd3f3150a54 100644
--- a/arch/riscv/lib/strncmp.S
+++ b/arch/riscv/lib/strncmp.S
@@ -78,11 +78,13 @@  strncmp_zbb:
 	/* Main loop for aligned string.  */
 	.p2align 3
 1:
-	bgt	a0, t6, 3f
+	bge	a0, t6, 3f
 	REG_L	t0, 0(a0)
 	REG_L	t1, 0(a1)
 	orc.b	t3, t0
 	bne	t3, t5, 2f
+	orc.b	t3, t1
+	bne	t3, t5, 2f
 	addi	a0, a0, SZREG
 	addi	a1, a1, SZREG
 	beq	t0, t1, 1b