Message ID | 65f08f01-d4ce-75c2-030b-f8759003e061@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | __asm_copy_to-from_user: Reduce more byte_copy | expand |
Hi Guenter, Geert and Qiu, Would you mind testing this patch? Thanks, Akira On 7/30/2021 10:50 PM, Akira Tsukamoto wrote: > Adding none unrolling word_copy, which is used if the size is smaller > than 9*SZREG. > > This patch is based on Palmer's past comment. >> My guess is that some workloads will want some smaller unrolling factors, > > It will reduce the number of slow byte_copy being used when the > size is small. > > Have tested on qemu rv32, qemu rv64 and beaglev beta board. > > In the future, I am planning to convert uaccess.S to inline assembly > in .c file. Then it will be easier to optimize on both in-order core and > out-of-order core with #ifdef macro in c. > > Akira Tsukamoto (1): > riscv: __asm_copy_to-from_user: Improve using word copy if size < > 9*SZREG > > arch/riscv/lib/uaccess.S | 46 ++++++++++++++++++++++++++++++++++++---- > 1 file changed, 42 insertions(+), 4 deletions(-) >
Hi Akira, This patch breaks my userspace and I can't boot my system after applying this. Here is the stack trace: [ 10.349080] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 10.357116] Oops [#15] [ 10.359433] CPU: 2 PID: 169 Comm: (networkd) Tainted: G D 5.14.0-rc5 #53 [ 10.367422] Hardware name: SiFive HiFive Unmatched A00 (DT) [ 10.372981] epc : __asm_copy_from_user+0x48/0xf0 [ 10.377584] ra : _copy_from_user+0x28/0x68 [ 10.381754] epc : ffffffff8099a280 ra : ffffffff803614a8 sp : ffffffd00416bd90 [ 10.388963] gp : ffffffff811ee540 tp : ffffffe0841b3680 t0 : ffffffd00416bde0 [ 10.396172] t1 : ffffffd00416bdd8 t2 : 0000003ff09ca3a0 s0 : ffffffd00416bdc0 [ 10.403381] s1 : 0000000000000000 a0 : ffffffd00416bdd8 a1 : 0000000000000000 [ 10.410590] a2 : 0000000000000010 a3 : 0000000000000040 a4 : ffffffd00416be18 [ 10.417800] a5 : 0000003ffffffff0 a6 : 000000000000000f a7 : ffffffe085d58540 [ 10.425009] s2 : 0000000000000010 s3 : ffffffd00416bdd8 s4 : 0000000000000002 [ 10.432218] s5 : 0000000000000000 s6 : 0000000000000000 s7 : ffffffe0841b3680 [ 10.439427] s8 : 0000002aad788040 s9 : 0000000000000000 s10: 0000000000000001 [ 10.446636] s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000001 [ 10.453845] t5 : 0000000000000010 t6 : 0000000000040000 [ 10.459144] status: 0000000200040120 badaddr: 0000000000000000 cause: 000000000000000d [ 10.467049] [<ffffffff8099a280>] __asm_copy_from_user+0x48/0xf0 [ 10.472955] [<ffffffff8009a562>] do_seccomp+0x62/0x8be [ 10.478079] [<ffffffff8009af58>] prctl_set_seccomp+0x24/0x32 [ 10.483725] [<ffffffff80020756>] sys_prctl+0xf6/0x450 [ 10.488763] [<ffffffff800034f2>] ret_from_syscall+0x0/0x2 The PC register points to this line: +1: + REG_L a5, 0(a1) Qiu On 8/12/21 7:01 PM, Akira Tsukamoto wrote: > Hi Guenter, Geert and Qiu, > > Would you mind testing this patch? > Thanks, > > Akira > > On 7/30/2021 10:50 PM, Akira Tsukamoto wrote: >> Adding none unrolling word_copy, which is used if the size is smaller >> than 9*SZREG. >> >> This patch is based on Palmer's past comment. >>> My guess is that some workloads will want some smaller unrolling factors, >> It will reduce the number of slow byte_copy being used when the >> size is small. >> >> Have tested on qemu rv32, qemu rv64 and beaglev beta board. >> >> In the future, I am planning to convert uaccess.S to inline assembly >> in .c file. Then it will be easier to optimize on both in-order core and >> out-of-order core with #ifdef macro in c. >> >> Akira Tsukamoto (1): >> riscv: __asm_copy_to-from_user: Improve using word copy if size < >> 9*SZREG >> >> arch/riscv/lib/uaccess.S | 46 ++++++++++++++++++++++++++++++++++++---- >> 1 file changed, 42 insertions(+), 4 deletions(-) >>
Hi Qiu, On 8/15/2021 11:30 AM, Qiu Wenbo wrote: > Hi Akira, > > > This patch breaks my userspace and I can't boot my system after applying this. Here is the stack trace: > > > [ 10.349080] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 > [ 10.357116] Oops [#15] > [ 10.359433] CPU: 2 PID: 169 Comm: (networkd) Tainted: G D 5.14.0-rc5 #53 > [ 10.367422] Hardware name: SiFive HiFive Unmatched A00 (DT) > [ 10.372981] epc : __asm_copy_from_user+0x48/0xf0 > [ 10.377584] ra : _copy_from_user+0x28/0x68 > [ 10.381754] epc : ffffffff8099a280 ra : ffffffff803614a8 sp : ffffffd00416bd90 > [ 10.388963] gp : ffffffff811ee540 tp : ffffffe0841b3680 t0 : ffffffd00416bde0 > [ 10.396172] t1 : ffffffd00416bdd8 t2 : 0000003ff09ca3a0 s0 : ffffffd00416bdc0 > [ 10.403381] s1 : 0000000000000000 a0 : ffffffd00416bdd8 a1 : 0000000000000000 > [ 10.410590] a2 : 0000000000000010 a3 : 0000000000000040 a4 : ffffffd00416be18 > [ 10.417800] a5 : 0000003ffffffff0 a6 : 000000000000000f a7 : ffffffe085d58540 > [ 10.425009] s2 : 0000000000000010 s3 : ffffffd00416bdd8 s4 : 0000000000000002 > [ 10.432218] s5 : 0000000000000000 s6 : 0000000000000000 s7 : ffffffe0841b3680 > [ 10.439427] s8 : 0000002aad788040 s9 : 0000000000000000 s10: 0000000000000001 > [ 10.446636] s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000001 > [ 10.453845] t5 : 0000000000000010 t6 : 0000000000040000 > [ 10.459144] status: 0000000200040120 badaddr: 0000000000000000 cause: 000000000000000d > [ 10.467049] [<ffffffff8099a280>] __asm_copy_from_user+0x48/0xf0 > [ 10.472955] [<ffffffff8009a562>] do_seccomp+0x62/0x8be > [ 10.478079] [<ffffffff8009af58>] prctl_set_seccomp+0x24/0x32 > [ 10.483725] [<ffffffff80020756>] sys_prctl+0xf6/0x450 > [ 10.488763] [<ffffffff800034f2>] ret_from_syscall+0x0/0x2 > > > The PC register points to this line: > > +1: > + REG_L a5, 0(a1) Thanks for testing! Do you mind teaching me how to reproduce the error? Akira
Hi Akira, I can reproduce it on my HiFive Unmatched with a custom Gentoo rootfs. As pointed out by Andreas, there might be a missing fixup. I'm going to debug this issue myself since I can reproduce it fairly stable. Qiu On 8/16/21 14:24, Akira Tsukamoto wrote: > Hi Qiu, > > On 8/15/2021 11:30 AM, Qiu Wenbo wrote: >> Hi Akira, >> >> >> This patch breaks my userspace and I can't boot my system after applying this. Here is the stack trace: >> >> >> [ 10.349080] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 >> [ 10.357116] Oops [#15] >> [ 10.359433] CPU: 2 PID: 169 Comm: (networkd) Tainted: G D 5.14.0-rc5 #53 >> [ 10.367422] Hardware name: SiFive HiFive Unmatched A00 (DT) >> [ 10.372981] epc : __asm_copy_from_user+0x48/0xf0 >> [ 10.377584] ra : _copy_from_user+0x28/0x68 >> [ 10.381754] epc : ffffffff8099a280 ra : ffffffff803614a8 sp : ffffffd00416bd90 >> [ 10.388963] gp : ffffffff811ee540 tp : ffffffe0841b3680 t0 : ffffffd00416bde0 >> [ 10.396172] t1 : ffffffd00416bdd8 t2 : 0000003ff09ca3a0 s0 : ffffffd00416bdc0 >> [ 10.403381] s1 : 0000000000000000 a0 : ffffffd00416bdd8 a1 : 0000000000000000 >> [ 10.410590] a2 : 0000000000000010 a3 : 0000000000000040 a4 : ffffffd00416be18 >> [ 10.417800] a5 : 0000003ffffffff0 a6 : 000000000000000f a7 : ffffffe085d58540 >> [ 10.425009] s2 : 0000000000000010 s3 : ffffffd00416bdd8 s4 : 0000000000000002 >> [ 10.432218] s5 : 0000000000000000 s6 : 0000000000000000 s7 : ffffffe0841b3680 >> [ 10.439427] s8 : 0000002aad788040 s9 : 0000000000000000 s10: 0000000000000001 >> [ 10.446636] s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000001 >> [ 10.453845] t5 : 0000000000000010 t6 : 0000000000040000 >> [ 10.459144] status: 0000000200040120 badaddr: 0000000000000000 cause: 000000000000000d >> [ 10.467049] [<ffffffff8099a280>] __asm_copy_from_user+0x48/0xf0 >> [ 10.472955] [<ffffffff8009a562>] do_seccomp+0x62/0x8be >> [ 10.478079] [<ffffffff8009af58>] prctl_set_seccomp+0x24/0x32 >> [ 10.483725] [<ffffffff80020756>] sys_prctl+0xf6/0x450 >> [ 10.488763] [<ffffffff800034f2>] ret_from_syscall+0x0/0x2 >> >> >> The PC register points to this line: >> >> +1: >> + REG_L a5, 0(a1) > Thanks for testing! Do you mind teaching me how to reproduce the error? > > Akira >
Hi Qiu, On 8/16/2021 6:45 PM, Qiu Wenbo wrote: > Hi Akira, > > > I can reproduce it on my HiFive Unmatched with a custom Gentoo rootfs. As pointed out by Andreas, there might be a missing fixup. I'm going to debug this issue myself since I can reproduce it fairly stable. Ah! Now I understand the bug. > + REG_L a5, 0(a1) should be + fixup REG_L a5, 0(a1) If you do not mind, could you make the patch to add 'fixup' to all REG_S and REG_L? Then I will resubmit them to Palmer with your patch. Thanks, Akira > > > Qiu > > > On 8/16/21 14:24, Akira Tsukamoto wrote: >> Hi Qiu, >> >> On 8/15/2021 11:30 AM, Qiu Wenbo wrote: >>> Hi Akira, >>> >>> >>> This patch breaks my userspace and I can't boot my system after applying this. Here is the stack trace: >>> >>> >>> [ 10.349080] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 >>> [ 10.357116] Oops [#15] >>> [ 10.359433] CPU: 2 PID: 169 Comm: (networkd) Tainted: G D 5.14.0-rc5 #53 >>> [ 10.367422] Hardware name: SiFive HiFive Unmatched A00 (DT) >>> [ 10.372981] epc : __asm_copy_from_user+0x48/0xf0 >>> [ 10.377584] ra : _copy_from_user+0x28/0x68 >>> [ 10.381754] epc : ffffffff8099a280 ra : ffffffff803614a8 sp : ffffffd00416bd90 >>> [ 10.388963] gp : ffffffff811ee540 tp : ffffffe0841b3680 t0 : ffffffd00416bde0 >>> [ 10.396172] t1 : ffffffd00416bdd8 t2 : 0000003ff09ca3a0 s0 : ffffffd00416bdc0 >>> [ 10.403381] s1 : 0000000000000000 a0 : ffffffd00416bdd8 a1 : 0000000000000000 >>> [ 10.410590] a2 : 0000000000000010 a3 : 0000000000000040 a4 : ffffffd00416be18 >>> [ 10.417800] a5 : 0000003ffffffff0 a6 : 000000000000000f a7 : ffffffe085d58540 >>> [ 10.425009] s2 : 0000000000000010 s3 : ffffffd00416bdd8 s4 : 0000000000000002 >>> [ 10.432218] s5 : 0000000000000000 s6 : 0000000000000000 s7 : ffffffe0841b3680 >>> [ 10.439427] s8 : 0000002aad788040 s9 : 0000000000000000 s10: 0000000000000001 >>> [ 10.446636] s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000001 >>> [ 10.453845] t5 : 0000000000000010 t6 : 0000000000040000 >>> [ 10.459144] status: 0000000200040120 badaddr: 0000000000000000 cause: 000000000000000d >>> [ 10.467049] [<ffffffff8099a280>] __asm_copy_from_user+0x48/0xf0 >>> [ 10.472955] [<ffffffff8009a562>] do_seccomp+0x62/0x8be >>> [ 10.478079] [<ffffffff8009af58>] prctl_set_seccomp+0x24/0x32 >>> [ 10.483725] [<ffffffff80020756>] sys_prctl+0xf6/0x450 >>> [ 10.488763] [<ffffffff800034f2>] ret_from_syscall+0x0/0x2 >>> >>> >>> The PC register points to this line: >>> >>> +1: >>> + REG_L a5, 0(a1) >> Thanks for testing! Do you mind teaching me how to reproduce the error? >> >> Akira >> > > >
Adding none unrolling word_copy, which is used if the size is smaller than 9*SZREG. This patch is based on Palmer's past comment. > My guess is that some workloads will want some smaller unrolling factors, It will reduce the number of slow byte_copy being used when the size is small. Have tested on qemu rv32, qemu rv64 and beaglev beta board. In the future, I am planning to convert uaccess.S to inline assembly in .c file. Then it will be easier to optimize on both in-order core and out-of-order core with #ifdef macro in c. Akira Tsukamoto (1): riscv: __asm_copy_to-from_user: Improve using word copy if size < 9*SZREG arch/riscv/lib/uaccess.S | 46 ++++++++++++++++++++++++++++++++++++---- 1 file changed, 42 insertions(+), 4 deletions(-)