mbox series

[0/1] __asm_copy_to-from_user: Reduce more byte_copy

Message ID 65f08f01-d4ce-75c2-030b-f8759003e061@gmail.com (mailing list archive)
Headers show
Series __asm_copy_to-from_user: Reduce more byte_copy | expand

Message

Akira Tsukamoto July 30, 2021, 1:50 p.m. UTC
Adding none unrolling word_copy, which is used if the size is smaller
than 9*SZREG.

This patch is based on Palmer's past comment.
> My guess is that some workloads will want some smaller unrolling factors,

It will reduce the number of slow byte_copy being used when the 
size is small.

Have tested on qemu rv32, qemu rv64 and beaglev beta board.

In the future, I am planning to convert uaccess.S to inline assembly 
in .c file. Then it will be easier to optimize on both in-order core and
out-of-order core with #ifdef macro in c.

Akira Tsukamoto (1):
  riscv: __asm_copy_to-from_user: Improve using word copy if size <
    9*SZREG

 arch/riscv/lib/uaccess.S | 46 ++++++++++++++++++++++++++++++++++++----
 1 file changed, 42 insertions(+), 4 deletions(-)

Comments

Akira Tsukamoto Aug. 12, 2021, 11:01 a.m. UTC | #1
Hi Guenter, Geert and Qiu,

Would you mind testing this patch?
Thanks,

Akira

On 7/30/2021 10:50 PM, Akira Tsukamoto wrote:
> Adding none unrolling word_copy, which is used if the size is smaller
> than 9*SZREG.
> 
> This patch is based on Palmer's past comment.
>> My guess is that some workloads will want some smaller unrolling factors,
> 
> It will reduce the number of slow byte_copy being used when the 
> size is small.
> 
> Have tested on qemu rv32, qemu rv64 and beaglev beta board.
> 
> In the future, I am planning to convert uaccess.S to inline assembly 
> in .c file. Then it will be easier to optimize on both in-order core and
> out-of-order core with #ifdef macro in c.
> 
> Akira Tsukamoto (1):
>   riscv: __asm_copy_to-from_user: Improve using word copy if size <
>     9*SZREG
> 
>  arch/riscv/lib/uaccess.S | 46 ++++++++++++++++++++++++++++++++++++----
>  1 file changed, 42 insertions(+), 4 deletions(-)
>
Qiu Wenbo Aug. 15, 2021, 2:30 a.m. UTC | #2
Hi Akira,


This patch breaks my userspace  and I can't boot my system after 
applying this. Here is the stack trace:


[   10.349080] Unable to handle kernel NULL pointer dereference at 
virtual address 0000000000000000
[   10.357116] Oops [#15]
[   10.359433] CPU: 2 PID: 169 Comm: (networkd) Tainted: G D           
5.14.0-rc5 #53
[   10.367422] Hardware name: SiFive HiFive Unmatched A00 (DT)
[   10.372981] epc : __asm_copy_from_user+0x48/0xf0
[   10.377584]  ra : _copy_from_user+0x28/0x68
[   10.381754] epc : ffffffff8099a280 ra : ffffffff803614a8 sp : 
ffffffd00416bd90
[   10.388963]  gp : ffffffff811ee540 tp : ffffffe0841b3680 t0 : 
ffffffd00416bde0
[   10.396172]  t1 : ffffffd00416bdd8 t2 : 0000003ff09ca3a0 s0 : 
ffffffd00416bdc0
[   10.403381]  s1 : 0000000000000000 a0 : ffffffd00416bdd8 a1 : 
0000000000000000
[   10.410590]  a2 : 0000000000000010 a3 : 0000000000000040 a4 : 
ffffffd00416be18
[   10.417800]  a5 : 0000003ffffffff0 a6 : 000000000000000f a7 : 
ffffffe085d58540
[   10.425009]  s2 : 0000000000000010 s3 : ffffffd00416bdd8 s4 : 
0000000000000002
[   10.432218]  s5 : 0000000000000000 s6 : 0000000000000000 s7 : 
ffffffe0841b3680
[   10.439427]  s8 : 0000002aad788040 s9 : 0000000000000000 s10: 
0000000000000001
[   10.446636]  s11: 0000000000000000 t3 : 0000000000000000 t4 : 
0000000000000001
[   10.453845]  t5 : 0000000000000010 t6 : 0000000000040000
[   10.459144] status: 0000000200040120 badaddr: 0000000000000000 cause: 
000000000000000d
[   10.467049] [<ffffffff8099a280>] __asm_copy_from_user+0x48/0xf0
[   10.472955] [<ffffffff8009a562>] do_seccomp+0x62/0x8be
[   10.478079] [<ffffffff8009af58>] prctl_set_seccomp+0x24/0x32
[   10.483725] [<ffffffff80020756>] sys_prctl+0xf6/0x450
[   10.488763] [<ffffffff800034f2>] ret_from_syscall+0x0/0x2


The PC register points to this line:

+1:
+	REG_L	a5, 0(a1)


Qiu


On 8/12/21 7:01 PM, Akira Tsukamoto wrote:
> Hi Guenter, Geert and Qiu,
>
> Would you mind testing this patch?
> Thanks,
>
> Akira
>
> On 7/30/2021 10:50 PM, Akira Tsukamoto wrote:
>> Adding none unrolling word_copy, which is used if the size is smaller
>> than 9*SZREG.
>>
>> This patch is based on Palmer's past comment.
>>> My guess is that some workloads will want some smaller unrolling factors,
>> It will reduce the number of slow byte_copy being used when the
>> size is small.
>>
>> Have tested on qemu rv32, qemu rv64 and beaglev beta board.
>>
>> In the future, I am planning to convert uaccess.S to inline assembly
>> in .c file. Then it will be easier to optimize on both in-order core and
>> out-of-order core with #ifdef macro in c.
>>
>> Akira Tsukamoto (1):
>>    riscv: __asm_copy_to-from_user: Improve using word copy if size <
>>      9*SZREG
>>
>>   arch/riscv/lib/uaccess.S | 46 ++++++++++++++++++++++++++++++++++++----
>>   1 file changed, 42 insertions(+), 4 deletions(-)
>>
Akira Tsukamoto Aug. 16, 2021, 6:24 a.m. UTC | #3
Hi Qiu,

On 8/15/2021 11:30 AM, Qiu Wenbo wrote:
> Hi Akira,
> 
> 
> This patch breaks my userspace  and I can't boot my system after applying this. Here is the stack trace:
> 
> 
> [   10.349080] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> [   10.357116] Oops [#15]
> [   10.359433] CPU: 2 PID: 169 Comm: (networkd) Tainted: G D           5.14.0-rc5 #53
> [   10.367422] Hardware name: SiFive HiFive Unmatched A00 (DT)
> [   10.372981] epc : __asm_copy_from_user+0x48/0xf0
> [   10.377584]  ra : _copy_from_user+0x28/0x68
> [   10.381754] epc : ffffffff8099a280 ra : ffffffff803614a8 sp : ffffffd00416bd90
> [   10.388963]  gp : ffffffff811ee540 tp : ffffffe0841b3680 t0 : ffffffd00416bde0
> [   10.396172]  t1 : ffffffd00416bdd8 t2 : 0000003ff09ca3a0 s0 : ffffffd00416bdc0
> [   10.403381]  s1 : 0000000000000000 a0 : ffffffd00416bdd8 a1 : 0000000000000000
> [   10.410590]  a2 : 0000000000000010 a3 : 0000000000000040 a4 : ffffffd00416be18
> [   10.417800]  a5 : 0000003ffffffff0 a6 : 000000000000000f a7 : ffffffe085d58540
> [   10.425009]  s2 : 0000000000000010 s3 : ffffffd00416bdd8 s4 : 0000000000000002
> [   10.432218]  s5 : 0000000000000000 s6 : 0000000000000000 s7 : ffffffe0841b3680
> [   10.439427]  s8 : 0000002aad788040 s9 : 0000000000000000 s10: 0000000000000001
> [   10.446636]  s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000001
> [   10.453845]  t5 : 0000000000000010 t6 : 0000000000040000
> [   10.459144] status: 0000000200040120 badaddr: 0000000000000000 cause: 000000000000000d
> [   10.467049] [<ffffffff8099a280>] __asm_copy_from_user+0x48/0xf0
> [   10.472955] [<ffffffff8009a562>] do_seccomp+0x62/0x8be
> [   10.478079] [<ffffffff8009af58>] prctl_set_seccomp+0x24/0x32
> [   10.483725] [<ffffffff80020756>] sys_prctl+0xf6/0x450
> [   10.488763] [<ffffffff800034f2>] ret_from_syscall+0x0/0x2
> 
> 
> The PC register points to this line:
> 
> +1:
> +    REG_L    a5, 0(a1)

Thanks for testing! Do you mind teaching me how to reproduce the error?

Akira
Qiu Wenbo Aug. 16, 2021, 9:45 a.m. UTC | #4
Hi Akira,


I can reproduce it on my HiFive Unmatched with a custom Gentoo rootfs. 
As pointed out by Andreas, there might be a missing fixup.  I'm going to 
debug this issue myself since I can reproduce it fairly stable.


Qiu


On 8/16/21 14:24, Akira Tsukamoto wrote:
> Hi Qiu,
>
> On 8/15/2021 11:30 AM, Qiu Wenbo wrote:
>> Hi Akira,
>>
>>
>> This patch breaks my userspace  and I can't boot my system after applying this. Here is the stack trace:
>>
>>
>> [   10.349080] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
>> [   10.357116] Oops [#15]
>> [   10.359433] CPU: 2 PID: 169 Comm: (networkd) Tainted: G D           5.14.0-rc5 #53
>> [   10.367422] Hardware name: SiFive HiFive Unmatched A00 (DT)
>> [   10.372981] epc : __asm_copy_from_user+0x48/0xf0
>> [   10.377584]  ra : _copy_from_user+0x28/0x68
>> [   10.381754] epc : ffffffff8099a280 ra : ffffffff803614a8 sp : ffffffd00416bd90
>> [   10.388963]  gp : ffffffff811ee540 tp : ffffffe0841b3680 t0 : ffffffd00416bde0
>> [   10.396172]  t1 : ffffffd00416bdd8 t2 : 0000003ff09ca3a0 s0 : ffffffd00416bdc0
>> [   10.403381]  s1 : 0000000000000000 a0 : ffffffd00416bdd8 a1 : 0000000000000000
>> [   10.410590]  a2 : 0000000000000010 a3 : 0000000000000040 a4 : ffffffd00416be18
>> [   10.417800]  a5 : 0000003ffffffff0 a6 : 000000000000000f a7 : ffffffe085d58540
>> [   10.425009]  s2 : 0000000000000010 s3 : ffffffd00416bdd8 s4 : 0000000000000002
>> [   10.432218]  s5 : 0000000000000000 s6 : 0000000000000000 s7 : ffffffe0841b3680
>> [   10.439427]  s8 : 0000002aad788040 s9 : 0000000000000000 s10: 0000000000000001
>> [   10.446636]  s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000001
>> [   10.453845]  t5 : 0000000000000010 t6 : 0000000000040000
>> [   10.459144] status: 0000000200040120 badaddr: 0000000000000000 cause: 000000000000000d
>> [   10.467049] [<ffffffff8099a280>] __asm_copy_from_user+0x48/0xf0
>> [   10.472955] [<ffffffff8009a562>] do_seccomp+0x62/0x8be
>> [   10.478079] [<ffffffff8009af58>] prctl_set_seccomp+0x24/0x32
>> [   10.483725] [<ffffffff80020756>] sys_prctl+0xf6/0x450
>> [   10.488763] [<ffffffff800034f2>] ret_from_syscall+0x0/0x2
>>
>>
>> The PC register points to this line:
>>
>> +1:
>> +    REG_L    a5, 0(a1)
> Thanks for testing! Do you mind teaching me how to reproduce the error?
>
> Akira
>
Akira Tsukamoto Aug. 17, 2021, 7:32 a.m. UTC | #5
Hi Qiu,


On 8/16/2021 6:45 PM, Qiu Wenbo wrote:
> Hi Akira,
> 
> 
> I can reproduce it on my HiFive Unmatched with a custom Gentoo rootfs. As pointed out by Andreas, there might be a missing fixup.  I'm going to debug this issue myself since I can reproduce it fairly stable.

Ah! Now I understand the bug.

> +    REG_L    a5, 0(a1)

should be 

+    fixup REG_L    a5, 0(a1)

If you do not mind, could you make the patch to add 'fixup' to all REG_S and REG_L?
Then I will resubmit them to Palmer with your patch.

Thanks,

Akira

> 
> 
> Qiu
> 
> 
> On 8/16/21 14:24, Akira Tsukamoto wrote:
>> Hi Qiu,
>>
>> On 8/15/2021 11:30 AM, Qiu Wenbo wrote:
>>> Hi Akira,
>>>
>>>
>>> This patch breaks my userspace  and I can't boot my system after applying this. Here is the stack trace:
>>>
>>>
>>> [   10.349080] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
>>> [   10.357116] Oops [#15]
>>> [   10.359433] CPU: 2 PID: 169 Comm: (networkd) Tainted: G D           5.14.0-rc5 #53
>>> [   10.367422] Hardware name: SiFive HiFive Unmatched A00 (DT)
>>> [   10.372981] epc : __asm_copy_from_user+0x48/0xf0
>>> [   10.377584]  ra : _copy_from_user+0x28/0x68
>>> [   10.381754] epc : ffffffff8099a280 ra : ffffffff803614a8 sp : ffffffd00416bd90
>>> [   10.388963]  gp : ffffffff811ee540 tp : ffffffe0841b3680 t0 : ffffffd00416bde0
>>> [   10.396172]  t1 : ffffffd00416bdd8 t2 : 0000003ff09ca3a0 s0 : ffffffd00416bdc0
>>> [   10.403381]  s1 : 0000000000000000 a0 : ffffffd00416bdd8 a1 : 0000000000000000
>>> [   10.410590]  a2 : 0000000000000010 a3 : 0000000000000040 a4 : ffffffd00416be18
>>> [   10.417800]  a5 : 0000003ffffffff0 a6 : 000000000000000f a7 : ffffffe085d58540
>>> [   10.425009]  s2 : 0000000000000010 s3 : ffffffd00416bdd8 s4 : 0000000000000002
>>> [   10.432218]  s5 : 0000000000000000 s6 : 0000000000000000 s7 : ffffffe0841b3680
>>> [   10.439427]  s8 : 0000002aad788040 s9 : 0000000000000000 s10: 0000000000000001
>>> [   10.446636]  s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000001
>>> [   10.453845]  t5 : 0000000000000010 t6 : 0000000000040000
>>> [   10.459144] status: 0000000200040120 badaddr: 0000000000000000 cause: 000000000000000d
>>> [   10.467049] [<ffffffff8099a280>] __asm_copy_from_user+0x48/0xf0
>>> [   10.472955] [<ffffffff8009a562>] do_seccomp+0x62/0x8be
>>> [   10.478079] [<ffffffff8009af58>] prctl_set_seccomp+0x24/0x32
>>> [   10.483725] [<ffffffff80020756>] sys_prctl+0xf6/0x450
>>> [   10.488763] [<ffffffff800034f2>] ret_from_syscall+0x0/0x2
>>>
>>>
>>> The PC register points to this line:
>>>
>>> +1:
>>> +    REG_L    a5, 0(a1)
>> Thanks for testing! Do you mind teaching me how to reproduce the error?
>>
>> Akira
>>
> 
> 
>