mbox series

[00/13] tools/nolibc: riscv: Add full rv32 support

Message ID cover.1684949267.git.falcon@tinylab.org (mailing list archive)
Headers show
Series tools/nolibc: riscv: Add full rv32 support | expand

Message

Zhangjin Wu May 24, 2023, 5:33 p.m. UTC
Hi, Willy

Thanks very mush for your kindly review, discuss and suggestion, now we
get full rv32 support ;-)

In the first series [1], we have fixed up the compile errors about
_start and __NR_llseek for rv32, but left compile errors about tons of
time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use
latest system call ABI")) and the missing fstat in nolibc-test.c [2],
now we have fixed up all of them.

Introduction
============

This series is based on the 20230524-nolibc-rv32+stkp4 branch of [3], it
includes 3 parts, they work together to add full rv32 support:

* Reverts two old out-of-day patches
  * Revert "tools/nolibc: riscv: Support __NR_llseek for rv32"
  * Revert "selftests/nolibc: Fix up compile error for rv32"

  (these two and the reverted ones:

    * commit 606343b7478c ("selftests/nolibc: Fix up compile error for rv32") 
    * commit d2c3acba6d66 ("tools/nolibc: riscv: Support __NR_llseek for rv32")

  can be removed from the git repo completely, there are two new ones to replace
  them)

* Compile and test support patches
  * selftests/nolibc: print name instead of number for EOVERFLOW
  * selftests/nolibc: syscall_args: use __NR_statx for rv32
    * --> replace the old one 606343b7478, use statx instead of read
  * selftests/nolibc: riscv: customize makefile for rv32
  * selftests/nolibc: allow specify a bios for qemu
  * selftests/nolibc: remove the duplicated gettimeofday_bad2

* Fix up some missing syscalls, mainly time32 syscalls
  * tools/nolibc: sys_lseek: riscv: use __NR_llseek for rv32
    * --> replace the old one d2c3acba6d66, cleaned up 
  * tools/nolibc: sys_poll: riscv: use __NR_ppoll_time64 for rv32
  * tools/nolibc: ppoll/ppoll_time64: Add a missing argument
  * tools/nolibc: sys_select: riscv: use __NR_pselect6_time64 for rv32
  * tools/nolibc: sys_wait4: riscv: use __NR_waitid for rv32
  * tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64 for rv32

Compile
=======

For rv64:

    $ make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
    $ file nolibc-test
    nolibc-test: ELF 64-bit LSB executable, UCB RISC-V ...

    $ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
    $ file nolibc-test
    nolibc-test: ELF 64-bit LSB executable, UCB RISC-V ...

For rv32:

    $ make ARCH=riscv CONFIG_32BIT=1 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
    $ file nolibc-test
    nolibc-test: ELF 32-bit LSB executable, UCB RISC-V ...

    $ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
    $ file nolibc-test
    nolibc-test: ELF 32-bit LSB executable, UCB RISC-V ...

Testing
=======

Environment:

    // gcc toolchain
    $ riscv64-linux-gnu-gcc --version
    riscv64-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
    Copyright (C) 2019 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

    // glibc >= 2.33 required, for older glibc, must upgrade include/bits/wordsize.h
    $ dpkg -l | grep libc6-dev | grep riscv
    ii  libc6-dev-riscv64-cross                  2.31-0ubuntu7cross1

    // glibc include/bits/wordsize.h: manually upgraded to >= 2.33
    // without this, can not build tools/testing/selftests/nolibc/nolibc-test.c
    $ cat /usr/riscv64-linux-gnu/include/bits/wordsize.h
    #if __riscv_xlen == (__SIZEOF_POINTER__ * 8)
    # define __WORDSIZE __riscv_xlen
    #else
    # error unsupported ABI
    #endif

    # define __WORDSIZE_TIME64_COMPAT32 1

    #if __WORDSIZE == 32
    # define __WORDSIZE32_SIZE_ULONG    0
    # define __WORDSIZE32_PTRDIFF_LONG  0
    #endif

    // higher qemu version is better, latest version is v8.0.0+
    $ qemu-system-riscv64  --version
    QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.18)
    Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers

    // opensbi version, higher is better, must match kernel version and qemu version
    // rv64: used version is 1.2, latest is 1.2
    $ head -2 /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.out | tail -1
    OpenSBI v1.2-116-g7919530
    // rv32: used version is v0.9, latest is 1.2
    $ head -2 /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.out | tail -1
    OpenSBI v0.9-152-g754d511

For rv64:

    $ pwd
    /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc
    $ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- defconfig
    $ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- BIOS=/labs/linux-lab/boards/riscv64/virt/bsp/bios/opensbi/generic/fw_jump.elf run
        MKDIR   sysroot/riscv/include
      make[1]: Entering directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
      make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
      make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
      make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
        INSTALL /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/sysroot/sysroot/include
      make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
      make[1]: Leaving directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
        CC      nolibc-test
        MKDIR   initramfs
        INSTALL initramfs/init
      make[1]: Entering directory '/labs/linux-lab/src/linux-stable'
        ...
        LD      vmlinux
        NM      System.map
        SORTTAB vmlinux
        OBJCOPY arch/riscv/boot/Image
        Kernel: arch/riscv/boot/Image is ready
      make[1]: Leaving directory '/labs/linux-lab/src/linux-stable'
      135 test(s) passed.
    $ file ../../../../vmlinux
    ../../../../vmlinux: ELF 64-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, BuildID[sha1]=b8e1cea5122b04bce540b4022f0d6f171ffe615a, not stripped

For rv32:

    $ pwd
    /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc
    $ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- defconfig
    $ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- BIOS=/labs/linux-lab/boards/riscv32/virt/bsp/bios/opensbi/generic/fw_jump.elf run
          MKDIR   sysroot/riscv/include
    make[1]: Entering directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
    make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
    make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
    make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
      INSTALL /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/sysroot/sysroot/include
    make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
    make[1]: Leaving directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
      CC      nolibc-test
      MKDIR   initramfs
      INSTALL initramfs/init
    make[1]: Entering directory '/labs/linux-lab/src/linux-stable'
      CALL    scripts/checksyscalls.sh
      GEN     usr/initramfs_data.cpio
      COPY    usr/initramfs_inc_data
      AS      usr/initramfs_data.o
      AR      usr/built-in.a
      GEN     security/selinux/flask.h security/selinux/av_permissions.h
      CC      security/selinux/avc.o
      CC      security/selinux/hooks.o
      CC      security/selinux/selinuxfs.o
      CC      security/selinux/nlmsgtab.o
      CC      security/selinux/netif.o
      CC      security/selinux/netnode.o
      CC      security/selinux/netport.o
      CC      security/selinux/status.o
      CC      security/selinux/ss/services.o
      AR      security/selinux/built-in.a
      AR      security/built-in.a
      AR      built-in.a
      AR      vmlinux.a
      LD      vmlinux.o
      OBJCOPY modules.builtin.modinfo
      GEN     modules.builtin
      MODPOST vmlinux.symvers
      UPD     include/generated/utsversion.h
      CC      init/version-timestamp.o
      LD      .tmp_vmlinux.kallsyms1
      NM      .tmp_vmlinux.kallsyms1.syms
      KSYMS   .tmp_vmlinux.kallsyms1.S
      AS      .tmp_vmlinux.kallsyms1.S
      LD      .tmp_vmlinux.kallsyms2
      NM      .tmp_vmlinux.kallsyms2.syms
      KSYMS   .tmp_vmlinux.kallsyms2.S
      AS      .tmp_vmlinux.kallsyms2.S
      LD      vmlinux
      NM      System.map
      SORTTAB vmlinux
      OBJCOPY arch/riscv/boot/Image
      Kernel: arch/riscv/boot/Image is ready
    make[1]: Leaving directory '/labs/linux-lab/src/linux-stable'
    135 test(s) passed.
    $ file ../../../../vmlinux
    ../../../../vmlinux: ELF 32-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, BuildID[sha1]=bad4c1f3899f47355d2a2010bade56972fd94b9d, not stripped
 
The full rv64 testing result (run.out) is uploaded at [4].
The full rv32 testing result (run.out) is uploaded at [5].

That's all, thanks!

Best regards,
Zhangjin Wu
---

[1]: https://lore.kernel.org/linux-riscv/20230520143154.68663-1-falcon@tinylab.org/T/#mf0e54ee19bd3f94dadbb4420ed9dffa316d1719d
[2]: https://lore.kernel.org/linux-riscv/20230520135235.68155-1-falcon@tinylab.org/T/#u
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git
[4]: https://pastebin.com/3L0nV78u
[5]: https://pastebin.com/RadrXdta 



Zhangjin Wu (13):
  Revert "tools/nolibc: riscv: Support __NR_llseek for rv32"
  Revert "selftests/nolibc: Fix up compile error for rv32"
  selftests/nolibc: print name instead of number for EOVERFLOW
  selftests/nolibc: syscall_args: use __NR_statx for rv32
  selftests/nolibc: riscv: customize makefile for rv32
  selftests/nolibc: allow specify a bios for qemu
  selftests/nolibc: remove the duplicated gettimeofday_bad2
  tools/nolibc: sys_lseek: riscv: use __NR_llseek for rv32
  tools/nolibc: sys_poll: riscv: use __NR_ppoll_time64 for rv32
  tools/nolibc: ppoll/ppoll_time64: Add a missing argument
  tools/nolibc: sys_select: riscv: use __NR_pselect6_time64 for rv32
  tools/nolibc: sys_wait4: riscv: use __NR_waitid for rv32
  tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64 for
    rv32

 tools/include/nolibc/std.h                   |   1 +
 tools/include/nolibc/sys.h                   | 135 +++++++++++++++++--
 tools/include/nolibc/types.h                 |  21 ++-
 tools/testing/selftests/nolibc/Makefile      |  14 +-
 tools/testing/selftests/nolibc/nolibc-test.c |  15 ++-
 5 files changed, 167 insertions(+), 19 deletions(-)

Comments

Zhangjin Wu May 24, 2023, 6:24 p.m. UTC | #1
Hi,

Just to mention, the 3rd one is missing in the riscv-linux mailing list
[1], but it is ok in the other two [2], [3], it was sent with the same
command ;-(

[1]: https://lore.kernel.org/linux-riscv/cover.1684949267.git.falcon@tinylab.org/T/#m1c2c31ec2f5dfafc7d0067a1e5fe430d591d74b8
[2]: https://lore.kernel.org/lkml/cover.1684949267.git.falcon@tinylab.org/T/#m1c2c31ec2f5dfafc7d0067a1e5fe430d591d74b8
[3]: https://lore.kernel.org/linux-kselftest/cover.1684949267.git.falcon@tinylab.org/T/#t

If required, do we need to resend the 3rd to riscv-linux only?

Thanks,
Zhangjin

> Hi, Willy
> 
> Thanks very mush for your kindly review, discuss and suggestion, now we
> get full rv32 support ;-)
> 
> In the first series [1], we have fixed up the compile errors about
> _start and __NR_llseek for rv32, but left compile errors about tons of
> time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use
> latest system call ABI")) and the missing fstat in nolibc-test.c [2],
> now we have fixed up all of them.
> 
> Introduction
> ============
> 
> This series is based on the 20230524-nolibc-rv32+stkp4 branch of [3], it
> includes 3 parts, they work together to add full rv32 support:
> 
> * Reverts two old out-of-day patches
>   * Revert "tools/nolibc: riscv: Support __NR_llseek for rv32"
>   * Revert "selftests/nolibc: Fix up compile error for rv32"
> 
>   (these two and the reverted ones:
> 
>     * commit 606343b7478c ("selftests/nolibc: Fix up compile error for rv32") 
>     * commit d2c3acba6d66 ("tools/nolibc: riscv: Support __NR_llseek for rv32")
> 
>   can be removed from the git repo completely, there are two new ones to replace
>   them)
> 
> * Compile and test support patches
>   * selftests/nolibc: print name instead of number for EOVERFLOW
>   * selftests/nolibc: syscall_args: use __NR_statx for rv32
>     * --> replace the old one 606343b7478, use statx instead of read
>   * selftests/nolibc: riscv: customize makefile for rv32
>   * selftests/nolibc: allow specify a bios for qemu
>   * selftests/nolibc: remove the duplicated gettimeofday_bad2
> 
> * Fix up some missing syscalls, mainly time32 syscalls
>   * tools/nolibc: sys_lseek: riscv: use __NR_llseek for rv32
>     * --> replace the old one d2c3acba6d66, cleaned up 
>   * tools/nolibc: sys_poll: riscv: use __NR_ppoll_time64 for rv32
>   * tools/nolibc: ppoll/ppoll_time64: Add a missing argument
>   * tools/nolibc: sys_select: riscv: use __NR_pselect6_time64 for rv32
>   * tools/nolibc: sys_wait4: riscv: use __NR_waitid for rv32
>   * tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64 for rv32
> 
> Compile
> =======
> 
> For rv64:
> 
>     $ make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
>     $ file nolibc-test
>     nolibc-test: ELF 64-bit LSB executable, UCB RISC-V ...
> 
>     $ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
>     $ file nolibc-test
>     nolibc-test: ELF 64-bit LSB executable, UCB RISC-V ...
> 
> For rv32:
> 
>     $ make ARCH=riscv CONFIG_32BIT=1 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
>     $ file nolibc-test
>     nolibc-test: ELF 32-bit LSB executable, UCB RISC-V ...
> 
>     $ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- nolibc-test
>     $ file nolibc-test
>     nolibc-test: ELF 32-bit LSB executable, UCB RISC-V ...
> 
> Testing
> =======
> 
> Environment:
> 
>     // gcc toolchain
>     $ riscv64-linux-gnu-gcc --version
>     riscv64-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
>     Copyright (C) 2019 Free Software Foundation, Inc.
>     This is free software; see the source for copying conditions.  There is NO
>     warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> 
>     // glibc >= 2.33 required, for older glibc, must upgrade include/bits/wordsize.h
>     $ dpkg -l | grep libc6-dev | grep riscv
>     ii  libc6-dev-riscv64-cross                  2.31-0ubuntu7cross1
> 
>     // glibc include/bits/wordsize.h: manually upgraded to >= 2.33
>     // without this, can not build tools/testing/selftests/nolibc/nolibc-test.c
>     $ cat /usr/riscv64-linux-gnu/include/bits/wordsize.h
>     #if __riscv_xlen == (__SIZEOF_POINTER__ * 8)
>     # define __WORDSIZE __riscv_xlen
>     #else
>     # error unsupported ABI
>     #endif
> 
>     # define __WORDSIZE_TIME64_COMPAT32 1
> 
>     #if __WORDSIZE == 32
>     # define __WORDSIZE32_SIZE_ULONG    0
>     # define __WORDSIZE32_PTRDIFF_LONG  0
>     #endif
> 
>     // higher qemu version is better, latest version is v8.0.0+
>     $ qemu-system-riscv64  --version
>     QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.18)
>     Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers
> 
>     // opensbi version, higher is better, must match kernel version and qemu version
>     // rv64: used version is 1.2, latest is 1.2
>     $ head -2 /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.out | tail -1
>     OpenSBI v1.2-116-g7919530
>     // rv32: used version is v0.9, latest is 1.2
>     $ head -2 /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/run.out | tail -1
>     OpenSBI v0.9-152-g754d511
> 
> For rv64:
> 
>     $ pwd
>     /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc
>     $ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- defconfig
>     $ make ARCH=riscv64 CROSS_COMPILE=riscv64-linux-gnu- BIOS=/labs/linux-lab/boards/riscv64/virt/bsp/bios/opensbi/generic/fw_jump.elf run
>         MKDIR   sysroot/riscv/include
>       make[1]: Entering directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
>       make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
>       make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
>       make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
>         INSTALL /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/sysroot/sysroot/include
>       make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
>       make[1]: Leaving directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
>         CC      nolibc-test
>         MKDIR   initramfs
>         INSTALL initramfs/init
>       make[1]: Entering directory '/labs/linux-lab/src/linux-stable'
>         ...
>         LD      vmlinux
>         NM      System.map
>         SORTTAB vmlinux
>         OBJCOPY arch/riscv/boot/Image
>         Kernel: arch/riscv/boot/Image is ready
>       make[1]: Leaving directory '/labs/linux-lab/src/linux-stable'
>       135 test(s) passed.
>     $ file ../../../../vmlinux
>     ../../../../vmlinux: ELF 64-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, BuildID[sha1]=b8e1cea5122b04bce540b4022f0d6f171ffe615a, not stripped
> 
> For rv32:
> 
>     $ pwd
>     /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc
>     $ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- defconfig
>     $ make ARCH=riscv32 CROSS_COMPILE=riscv64-linux-gnu- BIOS=/labs/linux-lab/boards/riscv32/virt/bsp/bios/opensbi/generic/fw_jump.elf run
>           MKDIR   sysroot/riscv/include
>     make[1]: Entering directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
>     make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
>     make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
>     make[2]: Entering directory '/labs/linux-lab/src/linux-stable'
>       INSTALL /labs/linux-lab/src/linux-stable/tools/testing/selftests/nolibc/sysroot/sysroot/include
>     make[2]: Leaving directory '/labs/linux-lab/src/linux-stable'
>     make[1]: Leaving directory '/labs/linux-lab/src/linux-stable/tools/include/nolibc'
>       CC      nolibc-test
>       MKDIR   initramfs
>       INSTALL initramfs/init
>     make[1]: Entering directory '/labs/linux-lab/src/linux-stable'
>       CALL    scripts/checksyscalls.sh
>       GEN     usr/initramfs_data.cpio
>       COPY    usr/initramfs_inc_data
>       AS      usr/initramfs_data.o
>       AR      usr/built-in.a
>       GEN     security/selinux/flask.h security/selinux/av_permissions.h
>       CC      security/selinux/avc.o
>       CC      security/selinux/hooks.o
>       CC      security/selinux/selinuxfs.o
>       CC      security/selinux/nlmsgtab.o
>       CC      security/selinux/netif.o
>       CC      security/selinux/netnode.o
>       CC      security/selinux/netport.o
>       CC      security/selinux/status.o
>       CC      security/selinux/ss/services.o
>       AR      security/selinux/built-in.a
>       AR      security/built-in.a
>       AR      built-in.a
>       AR      vmlinux.a
>       LD      vmlinux.o
>       OBJCOPY modules.builtin.modinfo
>       GEN     modules.builtin
>       MODPOST vmlinux.symvers
>       UPD     include/generated/utsversion.h
>       CC      init/version-timestamp.o
>       LD      .tmp_vmlinux.kallsyms1
>       NM      .tmp_vmlinux.kallsyms1.syms
>       KSYMS   .tmp_vmlinux.kallsyms1.S
>       AS      .tmp_vmlinux.kallsyms1.S
>       LD      .tmp_vmlinux.kallsyms2
>       NM      .tmp_vmlinux.kallsyms2.syms
>       KSYMS   .tmp_vmlinux.kallsyms2.S
>       AS      .tmp_vmlinux.kallsyms2.S
>       LD      vmlinux
>       NM      System.map
>       SORTTAB vmlinux
>       OBJCOPY arch/riscv/boot/Image
>       Kernel: arch/riscv/boot/Image is ready
>     make[1]: Leaving directory '/labs/linux-lab/src/linux-stable'
>     135 test(s) passed.
>     $ file ../../../../vmlinux
>     ../../../../vmlinux: ELF 32-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, BuildID[sha1]=bad4c1f3899f47355d2a2010bade56972fd94b9d, not stripped
>  
> The full rv64 testing result (run.out) is uploaded at [4].
> The full rv32 testing result (run.out) is uploaded at [5].
> 
> That's all, thanks!
> 
> Best regards,
> Zhangjin Wu
> ---
> 
> [1]: https://lore.kernel.org/linux-riscv/20230520143154.68663-1-falcon@tinylab.org/T/#mf0e54ee19bd3f94dadbb4420ed9dffa316d1719d
> [2]: https://lore.kernel.org/linux-riscv/20230520135235.68155-1-falcon@tinylab.org/T/#u
> [3]: https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git
> [4]: https://pastebin.com/3L0nV78u
> [5]: https://pastebin.com/RadrXdta 
> 
> 
> 
> Zhangjin Wu (13):
>   Revert "tools/nolibc: riscv: Support __NR_llseek for rv32"
>   Revert "selftests/nolibc: Fix up compile error for rv32"
>   selftests/nolibc: print name instead of number for EOVERFLOW
>   selftests/nolibc: syscall_args: use __NR_statx for rv32
>   selftests/nolibc: riscv: customize makefile for rv32
>   selftests/nolibc: allow specify a bios for qemu
>   selftests/nolibc: remove the duplicated gettimeofday_bad2
>   tools/nolibc: sys_lseek: riscv: use __NR_llseek for rv32
>   tools/nolibc: sys_poll: riscv: use __NR_ppoll_time64 for rv32
>   tools/nolibc: ppoll/ppoll_time64: Add a missing argument
>   tools/nolibc: sys_select: riscv: use __NR_pselect6_time64 for rv32
>   tools/nolibc: sys_wait4: riscv: use __NR_waitid for rv32
>   tools/nolibc: sys_gettimeofday: riscv: use __NR_clock_gettime64 for
>     rv32
> 
>  tools/include/nolibc/std.h                   |   1 +
>  tools/include/nolibc/sys.h                   | 135 +++++++++++++++++--
>  tools/include/nolibc/types.h                 |  21 ++-
>  tools/testing/selftests/nolibc/Makefile      |  14 +-
>  tools/testing/selftests/nolibc/nolibc-test.c |  15 ++-
>  5 files changed, 167 insertions(+), 19 deletions(-)
> 
> -- 
> 2.25.1
Willy Tarreau May 28, 2023, 7:59 a.m. UTC | #2
Hi Zhangjin,

On Thu, May 25, 2023 at 01:33:14AM +0800, Zhangjin Wu wrote:
> Hi, Willy
> 
> Thanks very mush for your kindly review, discuss and suggestion, now we
> get full rv32 support ;-)
> 
> In the first series [1], we have fixed up the compile errors about
> _start and __NR_llseek for rv32, but left compile errors about tons of
> time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use
> latest system call ABI")) and the missing fstat in nolibc-test.c [2],
> now we have fixed up all of them.

(...)

I have read the comments that others made on the series and overall
agree. I've seen that you intend to prepare a v2. I think we must
first decide how to better deal with emulated syscalls as I said in
an earlier message. Probably that we should just add a specific test
case for EFAULT in nolibc-test since it's the only one (I think) that
risks to trigger crashes with emulated syscalls. We could also imagine
dealing with the signal ourselves but I'm not that keen on going to
implement signal() & longjmp() for now :-/

Regardless, in order to clean the things up and relieve you from the
non-rv32 stuff, I've just reverted the two patches that your series
reverts (1 & 2), and added the EOVERFLOW one (3). I'm pushing this to
branch 20230528-nolibc-rv32+stkp5.

Regards,
Willy
Thomas Weißschuh May 28, 2023, 8:42 a.m. UTC | #3
On 2023-05-28 09:59:55+0200, Willy Tarreau wrote:
> On Thu, May 25, 2023 at 01:33:14AM +0800, Zhangjin Wu wrote:
> > Thanks very mush for your kindly review, discuss and suggestion, now we
> > get full rv32 support ;-)
> > 
> > In the first series [1], we have fixed up the compile errors about
> > _start and __NR_llseek for rv32, but left compile errors about tons of
> > time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use
> > latest system call ABI")) and the missing fstat in nolibc-test.c [2],
> > now we have fixed up all of them.
> 
> (...)
> 
> I have read the comments that others made on the series and overall
> agree. I've seen that you intend to prepare a v2. I think we must
> first decide how to better deal with emulated syscalls as I said in
> an earlier message. Probably that we should just add a specific test
> case for EFAULT in nolibc-test since it's the only one (I think) that
> risks to trigger crashes with emulated syscalls. We could also imagine
> dealing with the signal ourselves but I'm not that keen on going to
> implement signal() & longjmp() for now :-/
> 
> Regardless, in order to clean the things up and relieve you from the
> non-rv32 stuff, I've just reverted the two patches that your series
> reverts (1 & 2), and added the EOVERFLOW one (3). I'm pushing this to
> branch 20230528-nolibc-rv32+stkp5.

If you are fine with pushing more stuff to this branch, picking up 
the fix for the duplicated test gettimeofday_bad2 (7) would be nice, too.

Thomas
Thomas Weißschuh May 28, 2023, 9:41 a.m. UTC | #4
On 2023-05-28 10:42:39+0200, Thomas Weißschuh wrote:
> On 2023-05-28 09:59:55+0200, Willy Tarreau wrote:
> > On Thu, May 25, 2023 at 01:33:14AM +0800, Zhangjin Wu wrote:
> > > Thanks very mush for your kindly review, discuss and suggestion, now we
> > > get full rv32 support ;-)
> > > 
> > > In the first series [1], we have fixed up the compile errors about
> > > _start and __NR_llseek for rv32, but left compile errors about tons of
> > > time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use
> > > latest system call ABI")) and the missing fstat in nolibc-test.c [2],
> > > now we have fixed up all of them.
> > 
> > (...)
> > 
> > I have read the comments that others made on the series and overall
> > agree. I've seen that you intend to prepare a v2. I think we must
> > first decide how to better deal with emulated syscalls as I said in
> > an earlier message. Probably that we should just add a specific test
> > case for EFAULT in nolibc-test since it's the only one (I think) that
> > risks to trigger crashes with emulated syscalls. We could also imagine
> > dealing with the signal ourselves but I'm not that keen on going to
> > implement signal() & longjmp() for now :-/
> > 
> > Regardless, in order to clean the things up and relieve you from the
> > non-rv32 stuff, I've just reverted the two patches that your series
> > reverts (1 & 2), and added the EOVERFLOW one (3). I'm pushing this to
> > branch 20230528-nolibc-rv32+stkp5.
> 
> If you are fine with pushing more stuff to this branch, picking up 
> the fix for the duplicated test gettimeofday_bad2 (7) would be nice, too.

And the ppoll() argument cleanup (10) for that matter.

Zhangjin:

IMO it would be more convenient to move generic cleanup patches to the
beginning of the series.
When the reviewers are focussing on the real changes they won't be
interrupted by the cleanups. Also the maintainer can more easily pick
them up independently, so they are dealt with and nobody has to worry
about them anymore.

Thomas
Willy Tarreau May 28, 2023, 10:17 a.m. UTC | #5
On Sun, May 28, 2023 at 11:41:53AM +0200, Thomas Weißschuh wrote:
> > If you are fine with pushing more stuff to this branch, picking up 
> > the fix for the duplicated test gettimeofday_bad2 (7) would be nice, too.
> 
> And the ppoll() argument cleanup (10) for that matter.

OK now both done, thank you.

> IMO it would be more convenient to move generic cleanup patches to the
> beginning of the series.
> When the reviewers are focussing on the real changes they won't be
> interrupted by the cleanups. Also the maintainer can more easily pick
> them up independently, so they are dealt with and nobody has to worry
> about them anymore.

Agreed!

Thanks!
Willy
Zhangjin Wu May 28, 2023, 10:39 a.m. UTC | #6
Hi, Willy

> Hi Zhangjin,
> 
> On Thu, May 25, 2023 at 01:33:14AM +0800, Zhangjin Wu wrote:
> > Hi, Willy
> > 
> > Thanks very mush for your kindly review, discuss and suggestion, now we
> > get full rv32 support ;-)
> > 
> > In the first series [1], we have fixed up the compile errors about
> > _start and __NR_llseek for rv32, but left compile errors about tons of
> > time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use
> > latest system call ABI")) and the missing fstat in nolibc-test.c [2],
> > now we have fixed up all of them.
> 
> (...)
> 
> I have read the comments that others made on the series and overall
> agree. I've seen that you intend to prepare a v2. I think we must
> first decide how to better deal with emulated syscalls as I said in
> an earlier message. Probably that we should just add a specific test
> case for EFAULT in nolibc-test since it's the only one (I think) that
> risks to trigger crashes with emulated syscalls. We could also imagine
> dealing with the signal ourselves but I'm not that keen on going to
> implement signal() & longjmp() for now :-/
>

Yes, user-space signal() may be the right direction, we just need to let
user-space not crash the kernel, what about this 'solution' for current stage
(consider the pure time64 support too):

    #if defined(NOLIBC) && defined(__NR_gettimeofday) && __SIZEOF_LONG__ == 8
		CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break;
		CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break;
    #endif

This idea is from your commit 1da02f51088 ("selftests/nolibc: support glibc as
well") for glibc, but the difference is of course glibc not crashes the kernel.

Btw, since the gettimeofday_null case may be optimized by compiler and not
trigger such errors:

    // rv32
    nolibc-test.c:(.text.run_syscall+0x8c0): undefined reference to `__divdi3'

    // arm32
    nolibc-test.c:(.text.run_syscall+0x820): undefined reference to `__aeabi_ldivmod'

The above errors have been hidden after the disabling of the gettimeofday_bad1
test case, so, still need to solve it before sending v2.

The method used by musl may work, but the high bits may be lost (from long long
to int)?
 
	tv->tv_usec = (int)ts.tv_nsec / 1000;

Perhaps we really need to add the missing __divdi3 and __aeabi_ldivmod and the
ones for the other architectures, or get one from lib/math/div64.c.

Will add such new test cases to detect the above issues:

    CASE_TEST(gettimeofday_tv);   EXPECT_SYSZR(1, gettimeofday(&tv, NULL)); break;
    CASE_TEST(gettimeofday_tz);   EXPECT_SYSZR(1, gettimeofday(NULL, &tz)); break;
    CASE_TEST(gettimeofday_tv_tz);EXPECT_SYSZR(1, gettimeofday(&tv, &tz)); break;

May still require to add 'used' attribute to 'struct timeval tv' and 'struct
timeval tz' to let compiler not optimize them away.

For the waitid syscall based waitpid INT_MIN test case, I have prepared such
code:

    #define IF_TEST(name) \
    	if (strcmp(test, #name) == 0)

    const int _errorno(const char *test)
    {
    #ifdef __NR_wait4
    	IF_TEST(waitpid_min); return ESRCH;
    #else /* __NR_waitid */
    	IF_TEST(waitpid_min); return EINVAL;
    #endif
    	return 0;
    }

    #define errorno(test) _errorno(#test)

    CASE_TEST(waitpid_min);       EXPECT_SYSER(1, waitpid(INT_MIN, &tmp, WNOHANG), -1, errorno(waitpid_min)); break;

Instead of simply disabling this case, the above code allows to return
different values for different syscalls.

is a standalone errorno_waitpid_min() better? I just want to let future test
cases share some code, but it may be slower ;-)

> Regardless, in order to clean the things up and relieve you from the
> non-rv32 stuff, I've just reverted the two patches that your series
> reverts (1 & 2), and added the EOVERFLOW one (3). I'm pushing this to
> branch 20230528-nolibc-rv32+stkp5.
>

Thanks very much and I have seen another two have been pushed too, will rebase
everything on this new branch.

Based on the other suggestions from you and Thomas, I plan to send some generic
and independent changes at first, and then the left hard parts, It may simplify
the whole progress.

Best regards,
Zhangjin
 
> Regards,
> Willy
Willy Tarreau May 28, 2023, 11:33 a.m. UTC | #7
On Sun, May 28, 2023 at 06:39:57PM +0800, Zhangjin Wu wrote:
> > I have read the comments that others made on the series and overall
> > agree. I've seen that you intend to prepare a v2. I think we must
> > first decide how to better deal with emulated syscalls as I said in
> > an earlier message. Probably that we should just add a specific test
> > case for EFAULT in nolibc-test since it's the only one (I think) that
> > risks to trigger crashes with emulated syscalls. We could also imagine
> > dealing with the signal ourselves but I'm not that keen on going to
> > implement signal() & longjmp() for now :-/
> >
> 
> Yes, user-space signal() may be the right direction, we just need to let
> user-space not crash the kernel, what about this 'solution' for current stage
> (consider the pure time64 support too):
> 
>     #if defined(NOLIBC) && defined(__NR_gettimeofday) && __SIZEOF_LONG__ == 8
> 		CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break;
> 		CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break;
>     #endif
> 
> This idea is from your commit 1da02f51088 ("selftests/nolibc: support glibc as
> well") for glibc, but the difference is of course glibc not crashes the kernel.

Well, I was imagining implementing an EXPECT_EFAULT() macro that would
rely on whatever other macros we'd set to indicate that a syscall got
remapped. But I had another check grepping for EFAULT:

      CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break;
      CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break;
      CASE_TEST(poll_fault);        EXPECT_SYSER(1, poll((void *)1, 1, 0), -1, EFAULT); break;
      CASE_TEST(prctl);             EXPECT_SYSER(1, prctl(PR_SET_NAME, (unsigned long)NULL, 0, 0, 0), -1, EFAULT); break;
      CASE_TEST(select_fault);      EXPECT_SYSER(1, select(1, (void *)1, NULL, NULL, 0), -1, EFAULT); break;
      CASE_TEST(stat_fault);        EXPECT_SYSER(1, stat(NULL, &stat_buf), -1, EFAULT); break;
      CASE_TEST(syscall_args);      EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;

In short, they're very few, and several of these could simply be dropped
as irrelevant once we know that the libc is able to remap them and
dereference the arguments itself.

I'd be fine with dropping the two gettimeofday_bad ones, poll_fault,
select_fault and stat_fault. These ones already have at least one or
two other tests. These ones were initially added because they were
easy to implement, but if they're not relevant we can drop them and
stop wondering how to hack around the tests.

If that's OK for you as well I can do that.

> Btw, since the gettimeofday_null case may be optimized by compiler and not
> trigger such errors:
> 
>     // rv32
>     nolibc-test.c:(.text.run_syscall+0x8c0): undefined reference to `__divdi3'
> 
>     // arm32
>     nolibc-test.c:(.text.run_syscall+0x820): undefined reference to `__aeabi_ldivmod'
> 
> The above errors have been hidden after the disabling of the gettimeofday_bad1
> test case, so, still need to solve it before sending v2.

Sorry, I don't understand what you mean, I'm not seeing such a divide in
the code. Or maybe you're speaking about what you got after some of your
proposed changes ?

> The method used by musl may work, but the high bits may be lost (from long long
> to int)?
>  
> 	tv->tv_usec = (int)ts.tv_nsec / 1000;

Yes, and it would be even cleaner to use a uint here since tv_nsec is
always positive. This will simply result in a multiplication and a
shift on most platforms. Of course that's the type of thing you normally
don't want on a fast path for some small systems but here code compacity
counts more and that's fine.

> Perhaps we really need to add the missing __divdi3 and __aeabi_ldivmod and the
> ones for the other architectures, or get one from lib/math/div64.c.

No, these ones come from the compiler via libgcc_s, we must not try to
reimplement them. And we should do our best to avoid depending on them
to avoid the error you got above.

> Will add such new test cases to detect the above issues:
> 
>     CASE_TEST(gettimeofday_tv);   EXPECT_SYSZR(1, gettimeofday(&tv, NULL)); break;
>     CASE_TEST(gettimeofday_tz);   EXPECT_SYSZR(1, gettimeofday(NULL, &tz)); break;
>     CASE_TEST(gettimeofday_tv_tz);EXPECT_SYSZR(1, gettimeofday(&tv, &tz)); break;
> 
> May still require to add 'used' attribute to 'struct timeval tv' and 'struct
> timeval tz' to let compiler not optimize them away.

Maybe, or turn them to volatile as well.

> For the waitid syscall based waitpid INT_MIN test case, I have prepared such
> code:
> 
>     #define IF_TEST(name) \
>     	if (strcmp(test, #name) == 0)
> 
>     const int _errorno(const char *test)
>     {
>     #ifdef __NR_wait4
>     	IF_TEST(waitpid_min); return ESRCH;
>     #else /* __NR_waitid */
>     	IF_TEST(waitpid_min); return EINVAL;
>     #endif
>     	return 0;
>     }
> 
>     #define errorno(test) _errorno(#test)
> 
>     CASE_TEST(waitpid_min);       EXPECT_SYSER(1, waitpid(INT_MIN, &tmp, WNOHANG), -1, errorno(waitpid_min)); break;
> 
> Instead of simply disabling this case, the above code allows to return
> different values for different syscalls.

I don't like this, it gets particularly complicated to follow, especially
since it doesn't rely on the underlying syscall but on which ones are
defined, and supposes that the underlying implementation will use exactly
these ones. Do not forget that we're not trying to verify that the tests
provoke a specific syscall return, but that our syscall implementation
returns the errno the application expects. If we see that one of them
breaks, it means either that our test is wrong or undefined, or that our
mapping of the syscall is incorrect. But in either case it indicates that
an application relying on a specific errno would see a different value.

Many syscalls can return various values among a set, depending on which
error is tested first. If that's the case for the ones above, I'd largely
prefer to have EXPECT_SYSER2() that accepts any errno among a set of two
(and maybe layer EXPECT_SYSER3() if 3 errno are possible).

Also there's something to keep in mind: nolibc-test is just one userland
application among others. This means that every time you need to modify
it to shut up an error that pops up after a change to nolibc, it means
that you're possibly breaking one application living on an edge case and
explicitly checking for that errno value. It is not necessarily dramatic
but that's still something to keep in mind. We've all seen some of our
code fail after a syscall started to report a new errno value we didn't
expect, so it's important to still be cautious here and not to rely too
much on the ease of adapting error handling in nolibc-test.

> Thanks very much and I have seen another two have been pushed too, will rebase
> everything on this new branch.

OK.

> Based on the other suggestions from you and Thomas, I plan to send some generic
> and independent changes at first, and then the left hard parts, It may simplify
> the whole progress.

Yes, thank you! As a general rule of thumb (which makes the handling
easier for everyone including you), the least controversial changes should
be proposed first. This often allows to merge the first half of the patches
at once and saves you from having to reorder what's left.

Willy
Zhangjin Wu May 28, 2023, 12:52 p.m. UTC | #8
> On Sun, May 28, 2023 at 06:39:57PM +0800, Zhangjin Wu wrote:
> > > I have read the comments that others made on the series and overall
> > > agree. I've seen that you intend to prepare a v2. I think we must
> > > first decide how to better deal with emulated syscalls as I said in
> > > an earlier message. Probably that we should just add a specific test
> > > case for EFAULT in nolibc-test since it's the only one (I think) that
> > > risks to trigger crashes with emulated syscalls. We could also imagine
> > > dealing with the signal ourselves but I'm not that keen on going to
> > > implement signal() & longjmp() for now :-/
> > >
> > 
> > Yes, user-space signal() may be the right direction, we just need to let
> > user-space not crash the kernel, what about this 'solution' for current stage
> > (consider the pure time64 support too):
> > 
> >     #if defined(NOLIBC) && defined(__NR_gettimeofday) && __SIZEOF_LONG__ == 8
> > 		CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break;
> > 		CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break;
> >     #endif
> > 
> > This idea is from your commit 1da02f51088 ("selftests/nolibc: support glibc as
> > well") for glibc, but the difference is of course glibc not crashes the kernel.
> 
> Well, I was imagining implementing an EXPECT_EFAULT() macro that would
> rely on whatever other macros we'd set to indicate that a syscall got
> remapped. But I had another check grepping for EFAULT:
> 
>       CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break;
>       CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break;
>       CASE_TEST(poll_fault);        EXPECT_SYSER(1, poll((void *)1, 1, 0), -1, EFAULT); break;
>       CASE_TEST(prctl);             EXPECT_SYSER(1, prctl(PR_SET_NAME, (unsigned long)NULL, 0, 0, 0), -1, EFAULT); break;
>       CASE_TEST(select_fault);      EXPECT_SYSER(1, select(1, (void *)1, NULL, NULL, 0), -1, EFAULT); break;
>       CASE_TEST(stat_fault);        EXPECT_SYSER(1, stat(NULL, &stat_buf), -1, EFAULT); break;
>       CASE_TEST(syscall_args);      EXPECT_SYSER(1, syscall(__NR_fstat, 0, NULL), -1, EFAULT); break;
> 
> In short, they're very few, and several of these could simply be dropped
> as irrelevant once we know that the libc is able to remap them and
> dereference the arguments itself.
> 
> I'd be fine with dropping the two gettimeofday_bad ones, poll_fault,
> select_fault and stat_fault. These ones already have at least one or
> two other tests. These ones were initially added because they were
> easy to implement, but if they're not relevant we can drop them and
> stop wondering how to hack around the tests.
> 
> If that's OK for you as well I can do that.
>

The dropping of the others is not required, since currently, we only
found these two gettimeofday test cases have such issues when we
implement them with clock_gettime/time64, because there is a "timespec
to timeval" conversion in user-space, if the data pointer could be
passed to kernel space, there would be no such issues (kernel will
transfer data via put_user() helpers).

> > Btw, since the gettimeofday_null case may be optimized by compiler and not
> > trigger such errors:
> > 
> >     // rv32
> >     nolibc-test.c:(.text.run_syscall+0x8c0): undefined reference to `__divdi3'
> > 
> >     // arm32
> >     nolibc-test.c:(.text.run_syscall+0x820): undefined reference to `__aeabi_ldivmod'
> > 
> > The above errors have been hidden after the disabling of the gettimeofday_bad1
> > test case, so, still need to solve it before sending v2.
> 
> Sorry, I don't understand what you mean, I'm not seeing such a divide in
> the code. Or maybe you're speaking about what you got after some of your
> proposed changes ?
> 
> > The method used by musl may work, but the high bits may be lost (from long long
> > to int)?
> >  
> > 	tv->tv_usec = (int)ts.tv_nsec / 1000;
> 
> Yes, and it would be even cleaner to use a uint here since tv_nsec is
> always positive. This will simply result in a multiplication and a
> shift on most platforms. Of course that's the type of thing you normally
> don't want on a fast path for some small systems but here code compacity
> counts more and that's fine.
>

Ok, will use uint here.

> > Perhaps we really need to add the missing __divdi3 and __aeabi_ldivmod and the
> > ones for the other architectures, or get one from lib/math/div64.c.
> 
> No, these ones come from the compiler via libgcc_s, we must not try to
> reimplement them. And we should do our best to avoid depending on them
> to avoid the error you got above.
> 
> > Will add such new test cases to detect the above issues:
> > 
> >     CASE_TEST(gettimeofday_tv);   EXPECT_SYSZR(1, gettimeofday(&tv, NULL)); break;
> >     CASE_TEST(gettimeofday_tz);   EXPECT_SYSZR(1, gettimeofday(NULL, &tz)); break;
> >     CASE_TEST(gettimeofday_tv_tz);EXPECT_SYSZR(1, gettimeofday(&tv, &tz)); break;
> > 
> > May still require to add 'used' attribute to 'struct timeval tv' and 'struct
> > timeval tz' to let compiler not optimize them away.
> 
> Maybe, or turn them to volatile as well.
>

Yeah, volatile is better.

> > For the waitid syscall based waitpid INT_MIN test case, I have prepared such
> > code:
> > 
> >     #define IF_TEST(name) \
> >     	if (strcmp(test, #name) == 0)
> > 
> >     const int _errorno(const char *test)
> >     {
> >     #ifdef __NR_wait4
> >     	IF_TEST(waitpid_min); return ESRCH;
> >     #else /* __NR_waitid */
> >     	IF_TEST(waitpid_min); return EINVAL;
> >     #endif
> >     	return 0;
> >     }
> > 
> >     #define errorno(test) _errorno(#test)
> > 
> >     CASE_TEST(waitpid_min);       EXPECT_SYSER(1, waitpid(INT_MIN, &tmp, WNOHANG), -1, errorno(waitpid_min)); break;
> > 
> > Instead of simply disabling this case, the above code allows to return
> > different values for different syscalls.
> 
> I don't like this, it gets particularly complicated to follow, especially
> since it doesn't rely on the underlying syscall but on which ones are
> defined, and supposes that the underlying implementation will use exactly
> these ones. Do not forget that we're not trying to verify that the tests
> provoke a specific syscall return, but that our syscall implementation
> returns the errno the application expects. If we see that one of them
> breaks, it means either that our test is wrong or undefined, or that our
> mapping of the syscall is incorrect. But in either case it indicates that
> an application relying on a specific errno would see a different value.
> 
> Many syscalls can return various values among a set, depending on which
> error is tested first. If that's the case for the ones above, I'd largely
> prefer to have EXPECT_SYSER2() that accepts any errno among a set of two
> (and maybe layer EXPECT_SYSER3() if 3 errno are possible).
>

Ok.

> Also there's something to keep in mind: nolibc-test is just one userland
> application among others. This means that every time you need to modify
> it to shut up an error that pops up after a change to nolibc, it means
> that you're possibly breaking one application living on an edge case and
> explicitly checking for that errno value. It is not necessarily dramatic
> but that's still something to keep in mind. We've all seen some of our
> code fail after a syscall started to report a new errno value we didn't
> expect, so it's important to still be cautious here and not to rely too
> much on the ease of adapting error handling in nolibc-test.
> 
> > Thanks very much and I have seen another two have been pushed too, will rebase
> > everything on this new branch.
> 
> OK.
> 
> > Based on the other suggestions from you and Thomas, I plan to send some generic
> > and independent changes at first, and then the left hard parts, It may simplify
> > the whole progress.
> 
> Yes, thank you! As a general rule of thumb (which makes the handling
> easier for everyone including you), the least controversial changes should
> be proposed first. This often allows to merge the first half of the patches
> at once and saves you from having to reorder what's left.
>

That's true.

Thanks,
Zhangjin

> Willy
Thomas Weißschuh May 28, 2023, 1:45 p.m. UTC | #9
Hi Zhangjin,


May 28, 2023 12:40:31 Zhangjin Wu <falcon@tinylab.org>:

> Hi, Willy
>
>> Hi Zhangjin,
>>
>> On Thu, May 25, 2023 at 01:33:14AM +0800, Zhangjin Wu wrote:
>>> Hi, Willy
>>>
>>> Thanks very mush for your kindly review, discuss and suggestion, now we
>>> get full rv32 support ;-)
>>>
>>> In the first series [1], we have fixed up the compile errors about
>>> _start and __NR_llseek for rv32, but left compile errors about tons of
>>> time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use
>>> latest system call ABI")) and the missing fstat in nolibc-test.c [2],
>>> now we have fixed up all of them.
>>
>> (...)
>>
>> I have read the comments that others made on the series and overall
>> agree. I've seen that you intend to prepare a v2. I think we must
>> first decide how to better deal with emulated syscalls as I said in
>> an earlier message. Probably that we should just add a specific test
>> case for EFAULT in nolibc-test since it's the only one (I think) that
>> risks to trigger crashes with emulated syscalls. We could also imagine
>> dealing with the signal ourselves but I'm not that keen on going to
>> implement signal() & longjmp() for now :-/
>>
>
> Yes, user-space signal() may be the right direction, we just need to let
> user-space not crash the kernel, what about this 'solution' for current stage
> (consider the pure time64 support too):

If you did manage to crash the actual kernel than that would be a bug in the kernel that needs to be fixed.
Feel free to describe how it happened and I'll take a look.

Thomas
Zhangjin Wu May 28, 2023, 6:39 p.m. UTC | #10
> May 28, 2023 12:40:31 Zhangjin Wu <falcon@tinylab.org>:
>
> > Hi, Willy
> >
> >> Hi Zhangjin,
> >>
> >> On Thu, May 25, 2023 at 01:33:14AM +0800, Zhangjin Wu wrote:
> >>> Hi, Willy
> >>>
> >>> Thanks very mush for your kindly review, discuss and suggestion, now we
> >>> get full rv32 support ;-)
> >>>
> >>> In the first series [1], we have fixed up the compile errors about
> >>> _start and __NR_llseek for rv32, but left compile errors about tons of
> >>> time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use
> >>> latest system call ABI")) and the missing fstat in nolibc-test.c [2],
> >>> now we have fixed up all of them.
> >>
> >> (...)
> >>
> >> I have read the comments that others made on the series and overall
> >> agree. I've seen that you intend to prepare a v2. I think we must
> >> first decide how to better deal with emulated syscalls as I said in
> >> an earlier message. Probably that we should just add a specific test
> >> case for EFAULT in nolibc-test since it's the only one (I think) that
> >> risks to trigger crashes with emulated syscalls. We could also imagine
> >> dealing with the signal ourselves but I'm not that keen on going to
> >> implement signal() & longjmp() for now :-/
> >>
> >
> > Yes, user-space signal() may be the right direction, we just need to let
> > user-space not crash the kernel, what about this 'solution' for current stage
> > (consider the pure time64 support too):
>
> If you did manage to crash the actual kernel than that would be a bug in the kernel that needs to be fixed.
> Feel free to describe how it happened and I'll take a look.
>

Sorry, my description above is not really right, the sigsegv (11) signal will
be sent to our program when it tries to write something to the address: (void
*)1 for this test case tries to do/test so:

    CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break;

In the original gettimeofday syscall based implementation, the kernel
side tries to use put_user to copy timespec data to user space timeval:

    kernel/time/time.c:

    if (put_user(ts.tv_sec, &tv->tv_sec) ||
	put_user(ts.tv_nsec / 1000, &tv->tv_usec))
	return -EFAULT;

The put_user() in arch/riscv/include/asm/uaccess.h will trigger the
'fixup' logic and return -EFAULT and let the other test cases continue.

But if add our clock_gettime/time64 syscalls based implementation, we must get
timespec from kernel space and then convert them to timeval in user space, the
address of timespec can be handled by kernel space too, but we must write them
to the address of timeval in user-space:

    tv->tv_sec = ts.tv_sec;
    tv->tv_usec = (unsigned int)ts.tv_nsec / 1000;

In above test case, tv above is something like (void *)1, it is invalid, kernel
will prevent writing and force send a sigsegv and stop the program:

   35 gettimeofday_bad1init[1]: unhandled signal 11 code 0x1 at 0x00000002 in init[10000+5000]
        CPU: 0 PID: 1 Comm: init Not tainted 6.4.0-rc1-00137-gfdc311fa22ed-dirty #60
        Hardware name: riscv-virtio,qemu (DT)
        epc : 00012c90 ra : 00012c6c sp : 9d097d90
         gp : 00016800 tp : 00000000 t0 : 00000000
         t1 : 0000000a t2 : 00000000 s0 : 00000001
         s1 : 00016008 a0 : 00000000 a1 : 9d097da8
         a2 : 00000014 a3 : 00000000 a4 : 00000000
         a5 : 00000000 a6 : 00000001 a7 : 00000193
         s2 : 00000023 s3 : 00000000 s4 : 9d097da4
         s5 : 00000000 s6 : 0000541b s7 : 00000007
         s8 : 9d097dcc s9 : 00014474 s10: 00016000
         s11: 00000006 t3 : 00000000 t4 : ffffffff
         t5 : 00000000 t6 : 00000000
        status: 00000020 badaddr: 00000002 cause: 0000000f
        Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Because our test run nolibc-test as init of initramfs on qemu, when init exit
but not reboot as normally, then it 'crashes' the kernel (kernel panic above).

If we have sigaction()/sigsetjmp/siglongjump support, then, we can call
'reboot()' in sigsegv signal handler, and event let it continue the other test
cases. sigaction seems only work to trigger when to call siglongjump,
siglongjump ask sigsetjmp to do the real recover action.

I did find some useful urls, and wrote such an exception restore logic, not
completely, not support NOLIBC_TEST environment variables yet.

    diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
    index 001ea789fa39..e7d488722e14 100644
    --- a/tools/testing/selftests/nolibc/nolibc-test.c
    +++ b/tools/testing/selftests/nolibc/nolibc-test.c
    @@ -110,6 +110,47 @@ const char *errorname(int err)
     	}
     }

    +#if !defined(NOLIBC)
    +#include <setjmp.h>
    +int test_base = 0;
    +int test_number = 0;
    +int test_llen = 0;
    +sigjmp_buf mark;
    +typedef int (*func_t)(int min, int max);
    +func_t test_func = NULL;
    +int test_idx = 0;
    +
    +static int pad_spc(int llen, int cnt, const char *fmt, ...);
    +static const struct test test_names[];
    +
    +void continue_run(void)
    +{
    +	int idx;
    +	int err;
    +	int ret;
    +	int min = 0;
    +	int max = INT_MAX;
    +
    +	test_llen += printf(" = %d ", -1);
    +	pad_spc(test_llen, 64, "[FAIL] (continued by sigaction/siglongjmp/sigsetjmp)\n");
    +	test_func = test_names[test_idx].func;
    +	test_func(test_number - test_base + 1, 1000);
    +
    +	for (idx = test_idx + 1; test_names[idx].name; idx++) {
    +		printf("Running test '%s'\n", test_names[idx].name);
    +		err = test_names[idx].func(min, max);
    +		ret += err;
    +		printf("Errors during this test: %d\n\n", err);
    +	}
    +}
    +
    +void action(int sig, siginfo_t *si, void *p)
    +{
    +	if (sig != SIGKILL)
    +		siglongjmp(mark, -1);
    +}
    +#endif
    +
     #define IF_TEST(name) \
     	if (strcmp(test, #name) == 0)

    @@ -447,8 +488,13 @@ static int expect_strne(const char *expr, int llen, const char *cmp)


     /* declare tests based on line numbers. There must be exactly one test per line. */
    +#if !defined(NOLIBC)
    +#define CASE_TEST(name) \
    +	case __LINE__: test_number = __LINE__; if (strcmp(#name, "getpid") == 0) { test_base = test_number; } llen += printf("%d %s", test, #name); test_llen = llen;
    +#else
     #define CASE_TEST(name) \
     	case __LINE__: llen += printf("%d %s", test, #name);
    +#endif

     /* used by some syscall tests below */
     int test_getdents64(const char *dir)
    @@ -582,7 +628,7 @@ int run_syscall(int min, int max)
     		CASE_TEST(gettimeofday_tv);   EXPECT_SYSZR(1, gettimeofday(&tv, NULL)); break;
     		CASE_TEST(gettimeofday_tz);   EXPECT_SYSZR(1, gettimeofday(NULL, &tz)); break;
     		CASE_TEST(gettimeofday_tv_tz);EXPECT_SYSZR(1, gettimeofday(&tv, &tz)); break;
    -#ifdef NOLIBC
    +#if 1
     		CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break;
     		CASE_TEST(gettimeofday_bad2); EXPECT_SYSER(1, gettimeofday(NULL, (void *)1), -1, EFAULT); break;
     #endif
    @@ -952,6 +998,22 @@ int main(int argc, char **argv, char **envp)
     	if (getpid() == 1)
     		prepare();

    +#if !defined(NOLIBC)
    +	struct sigaction sa = {0};
    +	sa.sa_sigaction = action;
    +	sa.sa_flags = SA_SIGINFO;
    +	ret = sigaction(SIGSEGV, &sa, NULL);
    +	if (ret == -1) {
    +		perror("sigaction");
    +		exit(1);
    +	}
    +
    +	if (sigsetjmp(mark, 1) != 0) {
    +		continue_run();
    +		exit(0);
    +	}
    +#endif
    +
     	/* the definition of a series of tests comes from either argv[1] or the
     	 * "NOLIBC_TEST" environment variable. It's made of a comma-delimited
     	 * series of test names and optional ranges:
    @@ -1008,6 +1070,9 @@ int main(int argc, char **argv, char **envp)

     					/* now's time to call the test */
     					printf("Running test '%s'\n", test_names[idx].name);
    +#if !defined(NOLIBC)
    +					test_idx = idx;
    +#endif
     					err = test_names[idx].func(min, max);
     					ret += err;
     					printf("Errors during this test: %d\n\n", err);
    @@ -1021,6 +1086,9 @@ int main(int argc, char **argv, char **envp)
     		/* no test mentioned, run everything */
     		for (idx = 0; test_names[idx].name; idx++) {
     			printf("Running test '%s'\n", test_names[idx].name);
    +#if !defined(NOLIBC)
    +			test_idx = idx;
    +#endif
     			err = test_names[idx].func(min, max);
     			ret += err;
     			printf("Errors during this test: %d\n\n", err);

usage:

    $ gcc -o nolibc-test tools/testing/selftests/nolibc/nolibc-test.c
    $ ./nolibc-test
    ...
    35 gettimeofday_tz = 0                                           [OK]
    36 gettimeofday_tv_tz = 0                                        [OK]
    37 gettimeofday_bad1 = -1                                       [FAIL] (continued by sigaction/siglongjmp/sigsetjmp)
    38 gettimeofday_bad2 = -1                                       [FAIL] (continued by sigaction/siglongjmp/sigsetjmp)
    39 getpagesize = 0                                               [OK]
    40 ioctl_tiocinq = 0                                             [OK]
    41 ioctl_tiocinq = 0                                             [OK]
    ...

It did work as expected, but for nolibc, we still need to add sigaction/siglongjump/sigsetjmp support.

Will send a patch based on Willy's latest branch, perhaps this may help us to
verify the future sigaction/siglongjump/sigsetjmp for nolibc.

ref: https://www.ibm.com/docs/en/i/7.1?topic=ssw_ibm_i_71/apis/sigsetj.html
     https://www.ibm.com/docs/en/zos/2.1.0?topic=functions-siglongjmp-restore-stack-environment-signal-mask

Best regards,
Zhangjin

> Thomas
Thomas Weißschuh May 29, 2023, 8:45 a.m. UTC | #11
Hi Zhangjin,

On 2023-05-29 02:39:06+0800, Zhangjin Wu wrote:
> > May 28, 2023 12:40:31 Zhangjin Wu <falcon@tinylab.org>:
> > >> On Thu, May 25, 2023 at 01:33:14AM +0800, Zhangjin Wu wrote:
> > >>> Hi, Willy
> > >>>
> > >>> Thanks very mush for your kindly review, discuss and suggestion, now we
> > >>> get full rv32 support ;-)
> > >>>
> > >>> In the first series [1], we have fixed up the compile errors about
> > >>> _start and __NR_llseek for rv32, but left compile errors about tons of
> > >>> time32 syscalls (removed after kernel commit d4c08b9776b3 ("riscv: Use
> > >>> latest system call ABI")) and the missing fstat in nolibc-test.c [2],
> > >>> now we have fixed up all of them.
> > >>
> > >> (...)
> > >>
> > >> I have read the comments that others made on the series and overall
> > >> agree. I've seen that you intend to prepare a v2. I think we must
> > >> first decide how to better deal with emulated syscalls as I said in
> > >> an earlier message. Probably that we should just add a specific test
> > >> case for EFAULT in nolibc-test since it's the only one (I think) that
> > >> risks to trigger crashes with emulated syscalls. We could also imagine
> > >> dealing with the signal ourselves but I'm not that keen on going to
> > >> implement signal() & longjmp() for now :-/
> > >>
> > >
> > > Yes, user-space signal() may be the right direction, we just need to let
> > > user-space not crash the kernel, what about this 'solution' for current stage
> > > (consider the pure time64 support too):
> >
> > If you did manage to crash the actual kernel than that would be a bug in the kernel that needs to be fixed.
> > Feel free to describe how it happened and I'll take a look.
> >
> 
> Sorry, my description above is not really right, the sigsegv (11) signal will
> be sent to our program when it tries to write something to the address: (void
> *)1 for this test case tries to do/test so:
> 
>     CASE_TEST(gettimeofday_bad1); EXPECT_SYSER(1, gettimeofday((void *)1, NULL), -1, EFAULT); break;

<snip>

>    35 gettimeofday_bad1init[1]: unhandled signal 11 code 0x1 at 0x00000002 in init[10000+5000]
>         CPU: 0 PID: 1 Comm: init Not tainted 6.4.0-rc1-00137-gfdc311fa22ed-dirty #60
>         Hardware name: riscv-virtio,qemu (DT)
>         epc : 00012c90 ra : 00012c6c sp : 9d097d90
>          gp : 00016800 tp : 00000000 t0 : 00000000
>          t1 : 0000000a t2 : 00000000 s0 : 00000001
>          s1 : 00016008 a0 : 00000000 a1 : 9d097da8
>          a2 : 00000014 a3 : 00000000 a4 : 00000000
>          a5 : 00000000 a6 : 00000001 a7 : 00000193
>          s2 : 00000023 s3 : 00000000 s4 : 9d097da4
>          s5 : 00000000 s6 : 0000541b s7 : 00000007
>          s8 : 9d097dcc s9 : 00014474 s10: 00016000
>          s11: 00000006 t3 : 00000000 t4 : ffffffff
>          t5 : 00000000 t6 : 00000000
>         status: 00000020 badaddr: 00000002 cause: 0000000f
>         Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> 
> Because our test run nolibc-test as init of initramfs on qemu, when init exit
> but not reboot as normally, then it 'crashes' the kernel (kernel panic above).

This makes sense, thanks. I just wanted to make sure no kernel bugs were
going unhandeld.

> If we have sigaction()/sigsetjmp/siglongjump support, then, we can call
> 'reboot()' in sigsegv signal handler, and event let it continue the other test
> cases. sigaction seems only work to trigger when to call siglongjump,
> siglongjump ask sigsetjmp to do the real recover action.
> 
> I did find some useful urls, and wrote such an exception restore logic, not
> completely, not support NOLIBC_TEST environment variables yet.

<lots of implementation>

> usage:
> 
>     $ gcc -o nolibc-test tools/testing/selftests/nolibc/nolibc-test.c
>     $ ./nolibc-test
>     ...
>     35 gettimeofday_tz = 0                                           [OK]
>     36 gettimeofday_tv_tz = 0                                        [OK]
>     37 gettimeofday_bad1 = -1                                       [FAIL] (continued by sigaction/siglongjmp/sigsetjmp)
>     38 gettimeofday_bad2 = -1                                       [FAIL] (continued by sigaction/siglongjmp/sigsetjmp)
>     39 getpagesize = 0                                               [OK]
>     40 ioctl_tiocinq = 0                                             [OK]
>     41 ioctl_tiocinq = 0                                             [OK]
>     ...
> 
> It did work as expected, but for nolibc, we still need to add sigaction/siglongjump/sigsetjmp support.
> 
> Will send a patch based on Willy's latest branch, perhaps this may help us to
> verify the future sigaction/siglongjump/sigsetjmp for nolibc.
> 
> ref: https://www.ibm.com/docs/en/i/7.1?topic=ssw_ibm_i_71/apis/sigsetj.html
>      https://www.ibm.com/docs/en/zos/2.1.0?topic=functions-siglongjmp-restore-stack-environment-signal-mask

This seems very complicated for fairly limited gain to be honest.

If we really want to keep the current testcase we could also ensure that
the pointer does not fall into the first page, as the first page is not
mapped under Linux:

0 <= addr < PAGE_SIZE

Or instead of PAGE_SIZE just hardcode 4096, as that should be the
minimum size and and does not require a lookup.

Thomas
Willy Tarreau May 29, 2023, 11:31 a.m. UTC | #12
Hi Thomas,

On Mon, May 29, 2023 at 10:45:40AM +0200, Thomas Weißschuh wrote:
> <lots of implementation>
> 
> > usage:
> > 
> >     $ gcc -o nolibc-test tools/testing/selftests/nolibc/nolibc-test.c
> >     $ ./nolibc-test
> >     ...
> >     35 gettimeofday_tz = 0                                           [OK]
> >     36 gettimeofday_tv_tz = 0                                        [OK]
> >     37 gettimeofday_bad1 = -1                                       [FAIL] (continued by sigaction/siglongjmp/sigsetjmp)
> >     38 gettimeofday_bad2 = -1                                       [FAIL] (continued by sigaction/siglongjmp/sigsetjmp)
> >     39 getpagesize = 0                                               [OK]
> >     40 ioctl_tiocinq = 0                                             [OK]
> >     41 ioctl_tiocinq = 0                                             [OK]
> >     ...
> > 
> > It did work as expected, but for nolibc, we still need to add sigaction/siglongjump/sigsetjmp support.
> > 
> > Will send a patch based on Willy's latest branch, perhaps this may help us to
> > verify the future sigaction/siglongjump/sigsetjmp for nolibc.
> > 
> > ref: https://www.ibm.com/docs/en/i/7.1?topic=ssw_ibm_i_71/apis/sigsetj.html
> >      https://www.ibm.com/docs/en/zos/2.1.0?topic=functions-siglongjmp-restore-stack-environment-signal-mask
> 
> This seems very complicated for fairly limited gain to be honest.

I agree as well. I'm not denying the fact that one day we may want to
support signal, longjmp and friends but I'm not convinced we want to
go through that just to make a few uncertain tests succeed.

> If we really want to keep the current testcase we could also ensure that
> the pointer does not fall into the first page, as the first page is not
> mapped under Linux:
> 
> 0 <= addr < PAGE_SIZE
> 
> Or instead of PAGE_SIZE just hardcode 4096, as that should be the
> minimum size and and does not require a lookup.

I would not even do that. It brings nothing to the application layer and
inflates the code. I'd rather just get rid of the EFAULT test cases that
rely on an unreliable syscall (i.e. one that may either be a real syscall
or an emulated one). The value brought by these tests is extremely low
and they were implemented only because they were easy to do. If they're
causing pain, let's just drop them.

Willy