mbox series

[-next,v18,00/20] riscv: Add vector ISA support

Message ID 20230414155843.12963-1-andy.chiu@sifive.com (mailing list archive)
Headers show
Series riscv: Add vector ISA support | expand

Message

Andy Chiu April 14, 2023, 3:58 p.m. UTC
This patchset is implemented based on vector 1.0 spec to add vector support
in riscv Linux kernel. There are some assumptions for this implementations.

1. We assume all harts has the same ISA in the system.
2. We disable vector in both kernel and user space [1] by default. Only
   enable an user's vector after an illegal instruction trap where it
   actually starts executing vector (the first-use trap [2]).
3. We detect "riscv,isa" to determine whether vector is support or not.

We defined a new structure __riscv_v_ext_state in struct thread_struct to
save/restore the vector related registers. It is used for both kernel space
and user space.
 - In kernel space, the datap pointer in __riscv_v_ext_state will be
   allocated to save vector registers.
 - In user space,
	- In signal handler of user space, the structure is placed
	  right after __riscv_ctx_hdr, which is embedded in fp reserved
	  aera. This is required to avoid ABI break [2]. And datap points
	  to the end of __riscv_v_ext_state.
	- In ptrace, the data will be put in ubuf in which we use
	  riscv_vr_get()/riscv_vr_set() to get or set the
	  __riscv_v_ext_state data structure from/to it, datap pointer
	  would be zeroed and vector registers will be copied to the
	  address right after the __riscv_v_ext_state structure in ubuf.

This patchset is rebased to v6.3-rc1 and it is tested by running several
vector programs simultaneously. It delivers signals correctly in a test
where we can see a valid ucontext_t in a signal handler, and a correct V
context returing back from it. And the ptrace interface is tested by
PTRACE_{GET,SET}REGSET. Lastly, KVM is tested by running above tests in
a guest using the same kernel image. All tests are done on an rv64gcv
virt QEMU.

Note: please apply the patch at [4] due to a regression introduced by
commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
optimizations") before testing the series.

Source tree:
https://github.com/sifive/riscv-linux/tree/riscv/for-next/vector-v18

Updated patches: 7, 10, 11
New patches: (none)
Unchanged patches: 1, 2, 3, 4, 5, 6, 8, 9, 12, 13, 14, 15, 16, 17, 18, 19, 20

Thanks in advance for your kind feedback!
Regards,
Andy Chiu

Links:
 - [1] https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/
 - [2] https://lore.kernel.org/all/73c0124c-4794-6e40-460c-b26df407f322@rivosinc.com/T/#u
 - [3] https://lore.kernel.org/all/20230128082847.3055316-1-apatel@ventanamicro.com/
 - [4] https://lore.kernel.org/all/CAHk-=wiAxtKyxs6BPEzirrXw1kXJ-7ZyGpgOrbzhmC=ud-6jBA@mail.gmail.com/

---
Changelog V18
 - Rebase to the latest -next branch (at 9c2598d)
 - patch 7: Detect inconsistent VLEN setup on an SMP system (Heiko).
 - patch 10: Add blank lines (Heiko)
 - patch 10: Return immediately in insn_is_vector() if an insn matches (Heiko)
 - patch 11: Use sizeof(vstate->datap) instead of sizeof(void*) (Eike)

Changelog V17
 - Rebase to the latest -next branch (at e45d6a5):
   - Solve conflicts at 9 and 13 due to generic entry
   - Use generic entry in do_trap_insn_illegal() trap handler

Changelog V16
 - Rebase to the latest for-next (at 4b74077):
 - Solve conflicts at 7, and 17
 - Use as-instr to detect if assembler supports .option arch directive
   and remove dependency from GAS, for both ZBB and V.
 - Cleanup code in KVM vector
 - Address issue reported by sparse
 - Refine code:
   - Fix a mixed-use of space/tab
   - Remove new lines at the end of file

Changelog V15
 - Rebase to risc-v -next (v6.3-rc1)
 - Make V depend on FD in Kconfig according to the spec and shut off v
   properly.
 - Fix a syntax error for clang build. But mark RISCV_ISA_V GAS only due
   to https://reviews.llvm.org/D123515
 - Use scratch reg in inline asm instead of t4.
 - Refine code.
 - Cleanup per-patch changelogs.

Changelog V14
 - Rebase to risc-v -next (v6.2-rc7)
 - Use TOOLCHAIN_HAS_V to detect if we can enable Vector. And refine
   KBUILD_CFLAGS to remove v from default compile option.
 - Drop illegal instruction handling patch in kvm and leave it to a
   independent series[3]. The series has merged into 6.3-rc1
 - Move KVM_RISCV_ISA_EXT_V to the end of enum to prevent potential ABI
   breaks.
 - Use PT_SIZE_ON_STACK instead of PT_SIZE to fit alignment. Also,
   remove panic log from v13 (15/19) because it is no longer relevant.
 - Rewrite insn_is_vector for better structuring (change if-else chain to
   a switch)
 - Fix compilation error in the middle of the series
 - Validate size of the alternative signal frame if V is enabled
   whenever:
     - The user call sigaltstack to update altstack
     - A signal is being delivered
 - Rename __riscv_v_state to __riscv_v_ext_state.
 - Add riscv_v_ prefix and rename rvv appropriately
 - Organize riscv_v_vsize setup code into vector.c
 - Address the issue mentioned by Heiko on !FPU case
 - Honor orignal authors that got changed accidentally in v13 4,5,6

Changelog V13
 - Rebase to latest risc-v next (v6.2-rc1)
 - vineetg: Re-organize the series to comply with bisect-ability
 - andy.chiu: Improve task switch with inline assembly
 - Re-structure the signal frame to avoid user ABI break.
 - Implemnt first-use trap and drop prctl for per-task V state
   enablement. Also, redirect this trap from hs to vs for kvm setup.
 - Do not expose V context in ptrace/sigframe until the task start using
   V. But still reserve V context for size ofsigframe reported by auxv.
 - Drop the kernel mode vector and leave it to another (future) series.

Changelog V12 (Chris)
 - rebases to some point after v5.18-rc6
 - add prctl to control per-process V state

Chnagelog V10
 - Rebase to v5.18-rc6
 - Merge several patches
 - Refine codes
 - Fix bugs
 - Add kvm vector support

Changelog V9
 - Rebase to v5.15
 - Merge several patches
 - Refine codes
 - Fix a kernel panic issue

Changelog V8
 - Rebase to v5.14
 - Refine struct __riscv_v_ext_state with struct __riscv_ctx_hdr
 - Refine has_vector into a static key
 - Defined __reserved space in struct sigcontext for vector and future extensions

Changelog V7
 - Add support for kernel mode vector
 - Add vector extension XOR implementation
 - Optimize task switch codes of vector
 - Allocate space for vector registers in start_thread()
 - Fix an illegal instruction exception when accessing vlenb
 - Optimize vector registers initialization
 - Initialize vector registers with proper vsetvli then it can work normally
 - Refine ptrace porting due to generic API changed
 - Code clean up

Changelog V6
 - Replace vle.v/vse.v instructions with vle8.v/vse8.v based on 0.9 spec
 - Add comments based on mailinglist feedback
 - Fix rv32 build error

Changelog V5
 - Using regset_size() correctly in generic ptrace
 - Fix the ptrace porting
 - Fix compile warning

Changelog V4
 - Support dynamic vlen
 - Fix bugs: lazy save/resotre, not saving vtype
 - Update VS bit offset based on latest vector spec
 - Add new vector csr based on latest vector spec
 - Code refine and removed unused macros

Changelog V3
 - Rebase linux-5.6-rc3 and tested with qemu
 - Seperate patches with Anup's advice
 - Give out a ABI puzzle with unlimited vlen

Changelog V2
 - Fixup typo "vecotr, fstate_save->vstate_save".
 - Fixup wrong saved registers' length in vector.S.
 - Seperate unrelated patches from this one.


*** BLURB HERE ***

Andy Chiu (4):
  riscv: Allocate user's vector context in the first-use trap
  riscv: signal: check fp-reserved words unconditionally
  riscv: signal: validate altstack to reflect Vector
  riscv: detect assembler support for .option arch

Greentime Hu (9):
  riscv: Add new csr defines related to vector extension
  riscv: Clear vector regfile on bootup
  riscv: Introduce Vector enable/disable helpers
  riscv: Introduce riscv_v_vsize to record size of Vector context
  riscv: Introduce struct/helpers to save/restore per-task Vector state
  riscv: Add task switch support for vector
  riscv: Add ptrace vector support
  riscv: signal: Add sigcontext save/restore for vector
  riscv: prevent stack corruption by reserving task_pt_regs(p) early

Guo Ren (4):
  riscv: Rename __switch_to_aux() -> fpu
  riscv: Extending cpufeature.c to detect V-extension
  riscv: Disable Vector Instructions for kernel itself
  riscv: Enable Vector code to be built

Vincent Chen (3):
  riscv: signal: Report signal frame size to userspace via auxv
  riscv: kvm: Add V extension to KVM ISA
  riscv: KVM: Add vector lazy save/restore support

 arch/riscv/Kconfig                       |  28 ++-
 arch/riscv/Makefile                      |   6 +-
 arch/riscv/include/asm/csr.h             |  18 +-
 arch/riscv/include/asm/elf.h             |   9 +
 arch/riscv/include/asm/hwcap.h           |   1 +
 arch/riscv/include/asm/insn.h            |  29 +++
 arch/riscv/include/asm/kvm_host.h        |   2 +
 arch/riscv/include/asm/kvm_vcpu_vector.h |  82 +++++++++
 arch/riscv/include/asm/processor.h       |   3 +
 arch/riscv/include/asm/switch_to.h       |   9 +-
 arch/riscv/include/asm/thread_info.h     |   3 +
 arch/riscv/include/asm/vector.h          | 180 +++++++++++++++++++
 arch/riscv/include/uapi/asm/auxvec.h     |   1 +
 arch/riscv/include/uapi/asm/hwcap.h      |   1 +
 arch/riscv/include/uapi/asm/kvm.h        |   8 +
 arch/riscv/include/uapi/asm/ptrace.h     |  39 ++++
 arch/riscv/include/uapi/asm/sigcontext.h |  16 +-
 arch/riscv/kernel/Makefile               |   1 +
 arch/riscv/kernel/cpufeature.c           |  13 ++
 arch/riscv/kernel/entry.S                |   6 +-
 arch/riscv/kernel/head.S                 |  41 ++++-
 arch/riscv/kernel/process.c              |  18 ++
 arch/riscv/kernel/ptrace.c               |  70 ++++++++
 arch/riscv/kernel/setup.c                |   3 +
 arch/riscv/kernel/signal.c               | 220 ++++++++++++++++++++---
 arch/riscv/kernel/smpboot.c              |   7 +
 arch/riscv/kernel/traps.c                |  26 ++-
 arch/riscv/kernel/vector.c               | 127 +++++++++++++
 arch/riscv/kvm/Makefile                  |   1 +
 arch/riscv/kvm/vcpu.c                    |  23 +++
 arch/riscv/kvm/vcpu_vector.c             | 186 +++++++++++++++++++
 include/uapi/linux/elf.h                 |   1 +
 32 files changed, 1129 insertions(+), 49 deletions(-)
 create mode 100644 arch/riscv/include/asm/kvm_vcpu_vector.h
 create mode 100644 arch/riscv/include/asm/vector.h
 create mode 100644 arch/riscv/kernel/vector.c
 create mode 100644 arch/riscv/kvm/vcpu_vector.c

Comments

Ben Dooks April 17, 2023, 3:56 p.m. UTC | #1
On 14/04/2023 16:58, Andy Chiu wrote:
> This patchset is implemented based on vector 1.0 spec to add vector support
> in riscv Linux kernel. There are some assumptions for this implementations.
> 
> 1. We assume all harts has the same ISA in the system.
> 2. We disable vector in both kernel and user space [1] by default. Only
>     enable an user's vector after an illegal instruction trap where it
>     actually starts executing vector (the first-use trap [2]).
> 3. We detect "riscv,isa" to determine whether vector is support or not.
> 
> We defined a new structure __riscv_v_ext_state in struct thread_struct to
> save/restore the vector related registers. It is used for both kernel space
> and user space.
>   - In kernel space, the datap pointer in __riscv_v_ext_state will be
>     allocated to save vector registers.
>   - In user space,
> 	- In signal handler of user space, the structure is placed
> 	  right after __riscv_ctx_hdr, which is embedded in fp reserved
> 	  aera. This is required to avoid ABI break [2]. And datap points
> 	  to the end of __riscv_v_ext_state.
> 	- In ptrace, the data will be put in ubuf in which we use
> 	  riscv_vr_get()/riscv_vr_set() to get or set the
> 	  __riscv_v_ext_state data structure from/to it, datap pointer
> 	  would be zeroed and vector registers will be copied to the
> 	  address right after the __riscv_v_ext_state structure in ubuf.
> 
> This patchset is rebased to v6.3-rc1 and it is tested by running several
> vector programs simultaneously. It delivers signals correctly in a test
> where we can see a valid ucontext_t in a signal handler, and a correct V
> context returing back from it. And the ptrace interface is tested by
> PTRACE_{GET,SET}REGSET. Lastly, KVM is tested by running above tests in
> a guest using the same kernel image. All tests are done on an rv64gcv
> virt QEMU.

Ok, are there plans for in-kernel vector patches, or have I missed
something in this list? I expect once things like the vector-crypto
hit then people will be wanting in-kernel accelerators.
Andy Chiu April 17, 2023, 4:26 p.m. UTC | #2
Hi Ben,

On Mon, Apr 17, 2023 at 11:56 PM Ben Dooks <ben.dooks@codethink.co.uk> wrote:
>
> On 14/04/2023 16:58, Andy Chiu wrote:
> > This patchset is implemented based on vector 1.0 spec to add vector support
> > in riscv Linux kernel. There are some assumptions for this implementations.
> >
> > 1. We assume all harts has the same ISA in the system.
> > 2. We disable vector in both kernel and user space [1] by default. Only
> >     enable an user's vector after an illegal instruction trap where it
> >     actually starts executing vector (the first-use trap [2]).
> > 3. We detect "riscv,isa" to determine whether vector is support or not.
> >
> > We defined a new structure __riscv_v_ext_state in struct thread_struct to
> > save/restore the vector related registers. It is used for both kernel space
> > and user space.
> >   - In kernel space, the datap pointer in __riscv_v_ext_state will be
> >     allocated to save vector registers.
> >   - In user space,
> >       - In signal handler of user space, the structure is placed
> >         right after __riscv_ctx_hdr, which is embedded in fp reserved
> >         aera. This is required to avoid ABI break [2]. And datap points
> >         to the end of __riscv_v_ext_state.
> >       - In ptrace, the data will be put in ubuf in which we use
> >         riscv_vr_get()/riscv_vr_set() to get or set the
> >         __riscv_v_ext_state data structure from/to it, datap pointer
> >         would be zeroed and vector registers will be copied to the
> >         address right after the __riscv_v_ext_state structure in ubuf.
> >
> > This patchset is rebased to v6.3-rc1 and it is tested by running several
> > vector programs simultaneously. It delivers signals correctly in a test
> > where we can see a valid ucontext_t in a signal handler, and a correct V
> > context returing back from it. And the ptrace interface is tested by
> > PTRACE_{GET,SET}REGSET. Lastly, KVM is tested by running above tests in
> > a guest using the same kernel image. All tests are done on an rv64gcv
> > virt QEMU.
>
> Ok, are there plans for in-kernel vector patches, or have I missed
> something in this list? I expect once things like the vector-crypto
> hit then people will be wanting in-kernel accelerators.
>

Yes, I am redesigning and planning to submit the in-kernel Vector
support recently. Currently the original one is carried by Heiko's
vector crypto series. The API interface of the refined one should
remain the same but with some optimizations.

> --
> Ben Dooks                               http://www.codethink.co.uk/
> Senior Engineer                         Codethink - Providing Genius
>
> https://www.codethink.co.uk/privacy.html
>

Cheers,
Andy
Björn Töpel April 19, 2023, 7:43 a.m. UTC | #3
Andy Chiu <andy.chiu@sifive.com> writes:

> This patchset is implemented based on vector 1.0 spec to add vector support
> in riscv Linux kernel. There are some assumptions for this implementations.
>
> 1. We assume all harts has the same ISA in the system.
> 2. We disable vector in both kernel and user space [1] by default. Only
>    enable an user's vector after an illegal instruction trap where it
>    actually starts executing vector (the first-use trap [2]).
> 3. We detect "riscv,isa" to determine whether vector is support or not.
>
> We defined a new structure __riscv_v_ext_state in struct thread_struct to
> save/restore the vector related registers. It is used for both kernel space
> and user space.
>  - In kernel space, the datap pointer in __riscv_v_ext_state will be
>    allocated to save vector registers.
>  - In user space,
> 	- In signal handler of user space, the structure is placed
> 	  right after __riscv_ctx_hdr, which is embedded in fp reserved
> 	  aera. This is required to avoid ABI break [2]. And datap points
> 	  to the end of __riscv_v_ext_state.
> 	- In ptrace, the data will be put in ubuf in which we use
> 	  riscv_vr_get()/riscv_vr_set() to get or set the
> 	  __riscv_v_ext_state data structure from/to it, datap pointer
> 	  would be zeroed and vector registers will be copied to the
> 	  address right after the __riscv_v_ext_state structure in ubuf.
>
> This patchset is rebased to v6.3-rc1 and it is tested by running several
> vector programs simultaneously. It delivers signals correctly in a test
> where we can see a valid ucontext_t in a signal handler, and a correct V
> context returing back from it. And the ptrace interface is tested by
> PTRACE_{GET,SET}REGSET. Lastly, KVM is tested by running above tests in
> a guest using the same kernel image. All tests are done on an rv64gcv
> virt QEMU.
>
> Note: please apply the patch at [4] due to a regression introduced by
> commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
> optimizations") before testing the series.
>
> Source tree:
> https://github.com/sifive/riscv-linux/tree/riscv/for-next/vector-v18

After some offlist discussions, we might have a identified a
potential libc->application ABI break.

Given an application that does custom task scheduling via a signal
handler. The application binary is not vector aware, but libc is. Libc
is using vector registers for memcpy. It's an "old application, new
library, new kernel"-scenario.

 | ...
 | struct context *p1_ctx;
 | struct context *p2_ctx;
 | 
 | void sighandler(int sig, siginfo_t *info, void *ucontext)
 | {
 |   if (p1_running)
 |     switch_to(p1_ctx, p2_ctx);
 |   if (p2_running)
 |     switch_to(p2_ctx, p1_ctx);
 | }
 | 
 | void p1(void)
 | {
 |   memcpy(foo, bar, 17);
 | }
 | 
 | void p2(void)
 | {
 |   ...
 | }
 | ...

The switch_to() function schedules p1() and p2(). E.g., the
application (assumes that it) saves the complete task state from
sigcontext (ucontext) to p1_ctx, and restores sigcontext to p2_ctx, so
when sigreturn is called, p2() is running, and p1() has been
interrupted.

The "old application" which is not aware of vector, is now run on a
vector enabled kernel/glibc.

Assume that the sighandler is hit, and p1() is in the middle of the
vector memcpy. The switch_to() function will not save the vector
state, and next time p2() is scheduled to run it will have incorrect
machine state.

Now:

Is this an actual or theoretical problem (i.e. are there any
applications in the wild)? I'd be surprised if it would not be the
latter...

Regardless, a kernel knob for disabling vector (sysctl/prctl) to avoid
these kind of breaks is needed (right?). Could this knob be a
follow-up patch to the existing v18 series?

Note that arm64 does not suffer from this with SVE, because the default
vector length (vl==0/128b*32) fits in the "legacy" sigcontext.


Björn
Björn Töpel April 19, 2023, 2:54 p.m. UTC | #4
Björn Töpel <bjorn@kernel.org> writes:

> Andy Chiu <andy.chiu@sifive.com> writes:
>
>> This patchset is implemented based on vector 1.0 spec to add vector support
>> in riscv Linux kernel. There are some assumptions for this implementations.
>>
>> 1. We assume all harts has the same ISA in the system.
>> 2. We disable vector in both kernel and user space [1] by default. Only
>>    enable an user's vector after an illegal instruction trap where it
>>    actually starts executing vector (the first-use trap [2]).
>> 3. We detect "riscv,isa" to determine whether vector is support or not.
>>
>> We defined a new structure __riscv_v_ext_state in struct thread_struct to
>> save/restore the vector related registers. It is used for both kernel space
>> and user space.
>>  - In kernel space, the datap pointer in __riscv_v_ext_state will be
>>    allocated to save vector registers.
>>  - In user space,
>> 	- In signal handler of user space, the structure is placed
>> 	  right after __riscv_ctx_hdr, which is embedded in fp reserved
>> 	  aera. This is required to avoid ABI break [2]. And datap points
>> 	  to the end of __riscv_v_ext_state.
>> 	- In ptrace, the data will be put in ubuf in which we use
>> 	  riscv_vr_get()/riscv_vr_set() to get or set the
>> 	  __riscv_v_ext_state data structure from/to it, datap pointer
>> 	  would be zeroed and vector registers will be copied to the
>> 	  address right after the __riscv_v_ext_state structure in ubuf.
>>
>> This patchset is rebased to v6.3-rc1 and it is tested by running several
>> vector programs simultaneously. It delivers signals correctly in a test
>> where we can see a valid ucontext_t in a signal handler, and a correct V
>> context returing back from it. And the ptrace interface is tested by
>> PTRACE_{GET,SET}REGSET. Lastly, KVM is tested by running above tests in
>> a guest using the same kernel image. All tests are done on an rv64gcv
>> virt QEMU.
>>
>> Note: please apply the patch at [4] due to a regression introduced by
>> commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
>> optimizations") before testing the series.
>>
>> Source tree:
>> https://github.com/sifive/riscv-linux/tree/riscv/for-next/vector-v18
>
> After some offlist discussions, we might have a identified a
> potential libc->application ABI break.
>
> Given an application that does custom task scheduling via a signal
> handler. The application binary is not vector aware, but libc is. Libc
> is using vector registers for memcpy. It's an "old application, new
> library, new kernel"-scenario.
>
>  | ...
>  | struct context *p1_ctx;
>  | struct context *p2_ctx;
>  | 
>  | void sighandler(int sig, siginfo_t *info, void *ucontext)
>  | {
>  |   if (p1_running)
>  |     switch_to(p1_ctx, p2_ctx);
>  |   if (p2_running)
>  |     switch_to(p2_ctx, p1_ctx);
>  | }
>  | 
>  | void p1(void)
>  | {
>  |   memcpy(foo, bar, 17);
>  | }
>  | 
>  | void p2(void)
>  | {
>  |   ...
>  | }
>  | ...
>
> The switch_to() function schedules p1() and p2(). E.g., the
> application (assumes that it) saves the complete task state from
> sigcontext (ucontext) to p1_ctx, and restores sigcontext to p2_ctx, so
> when sigreturn is called, p2() is running, and p1() has been
> interrupted.
>
> The "old application" which is not aware of vector, is now run on a
> vector enabled kernel/glibc.
>
> Assume that the sighandler is hit, and p1() is in the middle of the
> vector memcpy. The switch_to() function will not save the vector
> state, and next time p2() is scheduled to run it will have incorrect
> machine state.
>
> Now:
>
> Is this an actual or theoretical problem (i.e. are there any
> applications in the wild)? I'd be surprised if it would not be the
> latter...
>
> Regardless, a kernel knob for disabling vector (sysctl/prctl) to avoid
> these kind of breaks is needed (right?). Could this knob be a
> follow-up patch to the existing v18 series?
>
> Note that arm64 does not suffer from this with SVE, because the default
> vector length (vl==0/128b*32) fits in the "legacy" sigcontext.

Andy, to clarify from the patchwork call; In
Documentation/arm64/sve.rst:

There's a per-process prctl (section 6), and a system runtime conf
(section 9).


Björn
Palmer Dabbelt April 19, 2023, 3:18 p.m. UTC | #5
On Wed, 19 Apr 2023 07:54:23 PDT (-0700), bjorn@kernel.org wrote:
> Björn Töpel <bjorn@kernel.org> writes:
>
>> Andy Chiu <andy.chiu@sifive.com> writes:
>>
>>> This patchset is implemented based on vector 1.0 spec to add vector support
>>> in riscv Linux kernel. There are some assumptions for this implementations.
>>>
>>> 1. We assume all harts has the same ISA in the system.
>>> 2. We disable vector in both kernel and user space [1] by default. Only
>>>    enable an user's vector after an illegal instruction trap where it
>>>    actually starts executing vector (the first-use trap [2]).
>>> 3. We detect "riscv,isa" to determine whether vector is support or not.
>>>
>>> We defined a new structure __riscv_v_ext_state in struct thread_struct to
>>> save/restore the vector related registers. It is used for both kernel space
>>> and user space.
>>>  - In kernel space, the datap pointer in __riscv_v_ext_state will be
>>>    allocated to save vector registers.
>>>  - In user space,
>>> 	- In signal handler of user space, the structure is placed
>>> 	  right after __riscv_ctx_hdr, which is embedded in fp reserved
>>> 	  aera. This is required to avoid ABI break [2]. And datap points
>>> 	  to the end of __riscv_v_ext_state.
>>> 	- In ptrace, the data will be put in ubuf in which we use
>>> 	  riscv_vr_get()/riscv_vr_set() to get or set the
>>> 	  __riscv_v_ext_state data structure from/to it, datap pointer
>>> 	  would be zeroed and vector registers will be copied to the
>>> 	  address right after the __riscv_v_ext_state structure in ubuf.
>>>
>>> This patchset is rebased to v6.3-rc1 and it is tested by running several
>>> vector programs simultaneously. It delivers signals correctly in a test
>>> where we can see a valid ucontext_t in a signal handler, and a correct V
>>> context returing back from it. And the ptrace interface is tested by
>>> PTRACE_{GET,SET}REGSET. Lastly, KVM is tested by running above tests in
>>> a guest using the same kernel image. All tests are done on an rv64gcv
>>> virt QEMU.
>>>
>>> Note: please apply the patch at [4] due to a regression introduced by
>>> commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
>>> optimizations") before testing the series.
>>>
>>> Source tree:
>>> https://github.com/sifive/riscv-linux/tree/riscv/for-next/vector-v18
>>
>> After some offlist discussions, we might have a identified a
>> potential libc->application ABI break.
>>
>> Given an application that does custom task scheduling via a signal
>> handler. The application binary is not vector aware, but libc is. Libc
>> is using vector registers for memcpy. It's an "old application, new
>> library, new kernel"-scenario.
>>
>>  | ...
>>  | struct context *p1_ctx;
>>  | struct context *p2_ctx;
>>  | 
>>  | void sighandler(int sig, siginfo_t *info, void *ucontext)
>>  | {
>>  |   if (p1_running)
>>  |     switch_to(p1_ctx, p2_ctx);
>>  |   if (p2_running)
>>  |     switch_to(p2_ctx, p1_ctx);
>>  | }
>>  | 
>>  | void p1(void)
>>  | {
>>  |   memcpy(foo, bar, 17);
>>  | }
>>  | 
>>  | void p2(void)
>>  | {
>>  |   ...
>>  | }
>>  | ...
>>
>> The switch_to() function schedules p1() and p2(). E.g., the
>> application (assumes that it) saves the complete task state from
>> sigcontext (ucontext) to p1_ctx, and restores sigcontext to p2_ctx, so
>> when sigreturn is called, p2() is running, and p1() has been
>> interrupted.
>>
>> The "old application" which is not aware of vector, is now run on a
>> vector enabled kernel/glibc.
>>
>> Assume that the sighandler is hit, and p1() is in the middle of the
>> vector memcpy. The switch_to() function will not save the vector
>> state, and next time p2() is scheduled to run it will have incorrect
>> machine state.

Thanks for writing this up, and sorry I've dropped the ball a few times on
describing it.

>> Now:
>>
>> Is this an actual or theoretical problem (i.e. are there any
>> applications in the wild)? I'd be surprised if it would not be the
>> latter...

I also have no idea.  It's kind of odd to say "nobody cares about the 
ABI break" when we can manifest it with some fairly simple example, but 
I'd bet that nobody cares.

>> Regardless, a kernel knob for disabling vector (sysctl/prctl) to avoid
>> these kind of breaks is needed (right?). Could this knob be a
>> follow-up patch to the existing v18 series?
>>
>> Note that arm64 does not suffer from this with SVE, because the default
>> vector length (vl==0/128b*32) fits in the "legacy" sigcontext.
>
> Andy, to clarify from the patchwork call; In
> Documentation/arm64/sve.rst:
>
> There's a per-process prctl (section 6), and a system runtime conf
> (section 9).

I think if we want to play it safe WRT the ABI break, then we can 
essentially just do the same thing.  It'll be a much bigger cliff for us 
because we have no space for the V extension, but that was just a 
mistake and there's nothing we can do about it.

> Björn
Andy Chiu April 20, 2023, 4:36 p.m. UTC | #6
On Wed, Apr 19, 2023 at 11:18 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
> On Wed, 19 Apr 2023 07:54:23 PDT (-0700), bjorn@kernel.org wrote:
> > Björn Töpel <bjorn@kernel.org> writes:
> >
> >> Andy Chiu <andy.chiu@sifive.com> writes:
> >>
> >>> This patchset is implemented based on vector 1.0 spec to add vector support
> >>> in riscv Linux kernel. There are some assumptions for this implementations.
> >>>
> >>> 1. We assume all harts has the same ISA in the system.
> >>> 2. We disable vector in both kernel and user space [1] by default. Only
> >>>    enable an user's vector after an illegal instruction trap where it
> >>>    actually starts executing vector (the first-use trap [2]).
> >>> 3. We detect "riscv,isa" to determine whether vector is support or not.
> >>>
> >>> We defined a new structure __riscv_v_ext_state in struct thread_struct to
> >>> save/restore the vector related registers. It is used for both kernel space
> >>> and user space.
> >>>  - In kernel space, the datap pointer in __riscv_v_ext_state will be
> >>>    allocated to save vector registers.
> >>>  - In user space,
> >>>     - In signal handler of user space, the structure is placed
> >>>       right after __riscv_ctx_hdr, which is embedded in fp reserved
> >>>       aera. This is required to avoid ABI break [2]. And datap points
> >>>       to the end of __riscv_v_ext_state.
> >>>     - In ptrace, the data will be put in ubuf in which we use
> >>>       riscv_vr_get()/riscv_vr_set() to get or set the
> >>>       __riscv_v_ext_state data structure from/to it, datap pointer
> >>>       would be zeroed and vector registers will be copied to the
> >>>       address right after the __riscv_v_ext_state structure in ubuf.
> >>>
> >>> This patchset is rebased to v6.3-rc1 and it is tested by running several
> >>> vector programs simultaneously. It delivers signals correctly in a test
> >>> where we can see a valid ucontext_t in a signal handler, and a correct V
> >>> context returing back from it. And the ptrace interface is tested by
> >>> PTRACE_{GET,SET}REGSET. Lastly, KVM is tested by running above tests in
> >>> a guest using the same kernel image. All tests are done on an rv64gcv
> >>> virt QEMU.
> >>>
> >>> Note: please apply the patch at [4] due to a regression introduced by
> >>> commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
> >>> optimizations") before testing the series.
> >>>
> >>> Source tree:
> >>> https://github.com/sifive/riscv-linux/tree/riscv/for-next/vector-v18
> >>
> >> After some offlist discussions, we might have a identified a
> >> potential libc->application ABI break.
> >>
> >> Given an application that does custom task scheduling via a signal
> >> handler. The application binary is not vector aware, but libc is. Libc
> >> is using vector registers for memcpy. It's an "old application, new
> >> library, new kernel"-scenario.
> >>
> >>  | ...
> >>  | struct context *p1_ctx;
> >>  | struct context *p2_ctx;
> >>  |
> >>  | void sighandler(int sig, siginfo_t *info, void *ucontext)
> >>  | {
> >>  |   if (p1_running)
> >>  |     switch_to(p1_ctx, p2_ctx);
> >>  |   if (p2_running)
> >>  |     switch_to(p2_ctx, p1_ctx);
> >>  | }
> >>  |
> >>  | void p1(void)
> >>  | {
> >>  |   memcpy(foo, bar, 17);
> >>  | }
> >>  |
> >>  | void p2(void)
> >>  | {
> >>  |   ...
> >>  | }
> >>  | ...
> >>
> >> The switch_to() function schedules p1() and p2(). E.g., the
> >> application (assumes that it) saves the complete task state from
> >> sigcontext (ucontext) to p1_ctx, and restores sigcontext to p2_ctx, so
> >> when sigreturn is called, p2() is running, and p1() has been
> >> interrupted.
> >>
> >> The "old application" which is not aware of vector, is now run on a
> >> vector enabled kernel/glibc.
> >>
> >> Assume that the sighandler is hit, and p1() is in the middle of the
> >> vector memcpy. The switch_to() function will not save the vector
> >> state, and next time p2() is scheduled to run it will have incorrect
> >> machine state.
>
> Thanks for writing this up, and sorry I've dropped the ball a few times on
> describing it.
>
> >> Now:
> >>
> >> Is this an actual or theoretical problem (i.e. are there any
> >> applications in the wild)? I'd be surprised if it would not be the
> >> latter...
>
> I also have no idea.  It's kind of odd to say "nobody cares about the
> ABI break" when we can manifest it with some fairly simple example, but
> I'd bet that nobody cares.
>
> >> Regardless, a kernel knob for disabling vector (sysctl/prctl) to avoid
> >> these kind of breaks is needed (right?). Could this knob be a
> >> follow-up patch to the existing v18 series?
> >>
> >> Note that arm64 does not suffer from this with SVE, because the default
> >> vector length (vl==0/128b*32) fits in the "legacy" sigcontext.
> >
> > Andy, to clarify from the patchwork call; In
> > Documentation/arm64/sve.rst:
> >
> > There's a per-process prctl (section 6), and a system runtime conf
> > (section 9).

Thanks for pointing me out!

>
> I think if we want to play it safe WRT the ABI break, then we can
> essentially just do the same thing.  It'll be a much bigger cliff for us
> because we have no space for the V extension, but that was just a
> mistake and there's nothing we can do about it.

I understand the concern. It is good to provide a way to have explicit
controls of Vector rather than do nothing if such ABI break happens.
As for implementation details, do you think a system-wide  sysctl
alone is enough? Or, do we also need a prctl for per-process control?

>
> > Björn
Palmer Dabbelt April 26, 2023, 2:27 p.m. UTC | #7
On Thu, 20 Apr 2023 09:36:48 PDT (-0700), andy.chiu@sifive.com wrote:
> On Wed, Apr 19, 2023 at 11:18 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>>
>> On Wed, 19 Apr 2023 07:54:23 PDT (-0700), bjorn@kernel.org wrote:
>> > Björn Töpel <bjorn@kernel.org> writes:
>> >
>> >> Andy Chiu <andy.chiu@sifive.com> writes:
>> >>
>> >>> This patchset is implemented based on vector 1.0 spec to add vector support
>> >>> in riscv Linux kernel. There are some assumptions for this implementations.
>> >>>
>> >>> 1. We assume all harts has the same ISA in the system.
>> >>> 2. We disable vector in both kernel and user space [1] by default. Only
>> >>>    enable an user's vector after an illegal instruction trap where it
>> >>>    actually starts executing vector (the first-use trap [2]).
>> >>> 3. We detect "riscv,isa" to determine whether vector is support or not.
>> >>>
>> >>> We defined a new structure __riscv_v_ext_state in struct thread_struct to
>> >>> save/restore the vector related registers. It is used for both kernel space
>> >>> and user space.
>> >>>  - In kernel space, the datap pointer in __riscv_v_ext_state will be
>> >>>    allocated to save vector registers.
>> >>>  - In user space,
>> >>>     - In signal handler of user space, the structure is placed
>> >>>       right after __riscv_ctx_hdr, which is embedded in fp reserved
>> >>>       aera. This is required to avoid ABI break [2]. And datap points
>> >>>       to the end of __riscv_v_ext_state.
>> >>>     - In ptrace, the data will be put in ubuf in which we use
>> >>>       riscv_vr_get()/riscv_vr_set() to get or set the
>> >>>       __riscv_v_ext_state data structure from/to it, datap pointer
>> >>>       would be zeroed and vector registers will be copied to the
>> >>>       address right after the __riscv_v_ext_state structure in ubuf.
>> >>>
>> >>> This patchset is rebased to v6.3-rc1 and it is tested by running several
>> >>> vector programs simultaneously. It delivers signals correctly in a test
>> >>> where we can see a valid ucontext_t in a signal handler, and a correct V
>> >>> context returing back from it. And the ptrace interface is tested by
>> >>> PTRACE_{GET,SET}REGSET. Lastly, KVM is tested by running above tests in
>> >>> a guest using the same kernel image. All tests are done on an rv64gcv
>> >>> virt QEMU.
>> >>>
>> >>> Note: please apply the patch at [4] due to a regression introduced by
>> >>> commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
>> >>> optimizations") before testing the series.
>> >>>
>> >>> Source tree:
>> >>> https://github.com/sifive/riscv-linux/tree/riscv/for-next/vector-v18
>> >>
>> >> After some offlist discussions, we might have a identified a
>> >> potential libc->application ABI break.
>> >>
>> >> Given an application that does custom task scheduling via a signal
>> >> handler. The application binary is not vector aware, but libc is. Libc
>> >> is using vector registers for memcpy. It's an "old application, new
>> >> library, new kernel"-scenario.
>> >>
>> >>  | ...
>> >>  | struct context *p1_ctx;
>> >>  | struct context *p2_ctx;
>> >>  |
>> >>  | void sighandler(int sig, siginfo_t *info, void *ucontext)
>> >>  | {
>> >>  |   if (p1_running)
>> >>  |     switch_to(p1_ctx, p2_ctx);
>> >>  |   if (p2_running)
>> >>  |     switch_to(p2_ctx, p1_ctx);
>> >>  | }
>> >>  |
>> >>  | void p1(void)
>> >>  | {
>> >>  |   memcpy(foo, bar, 17);
>> >>  | }
>> >>  |
>> >>  | void p2(void)
>> >>  | {
>> >>  |   ...
>> >>  | }
>> >>  | ...
>> >>
>> >> The switch_to() function schedules p1() and p2(). E.g., the
>> >> application (assumes that it) saves the complete task state from
>> >> sigcontext (ucontext) to p1_ctx, and restores sigcontext to p2_ctx, so
>> >> when sigreturn is called, p2() is running, and p1() has been
>> >> interrupted.
>> >>
>> >> The "old application" which is not aware of vector, is now run on a
>> >> vector enabled kernel/glibc.
>> >>
>> >> Assume that the sighandler is hit, and p1() is in the middle of the
>> >> vector memcpy. The switch_to() function will not save the vector
>> >> state, and next time p2() is scheduled to run it will have incorrect
>> >> machine state.
>>
>> Thanks for writing this up, and sorry I've dropped the ball a few times on
>> describing it.
>>
>> >> Now:
>> >>
>> >> Is this an actual or theoretical problem (i.e. are there any
>> >> applications in the wild)? I'd be surprised if it would not be the
>> >> latter...
>>
>> I also have no idea.  It's kind of odd to say "nobody cares about the
>> ABI break" when we can manifest it with some fairly simple example, but
>> I'd bet that nobody cares.
>>
>> >> Regardless, a kernel knob for disabling vector (sysctl/prctl) to avoid
>> >> these kind of breaks is needed (right?). Could this knob be a
>> >> follow-up patch to the existing v18 series?
>> >>
>> >> Note that arm64 does not suffer from this with SVE, because the default
>> >> vector length (vl==0/128b*32) fits in the "legacy" sigcontext.
>> >
>> > Andy, to clarify from the patchwork call; In
>> > Documentation/arm64/sve.rst:
>> >
>> > There's a per-process prctl (section 6), and a system runtime conf
>> > (section 9).
>
> Thanks for pointing me out!
>
>>
>> I think if we want to play it safe WRT the ABI break, then we can
>> essentially just do the same thing.  It'll be a much bigger cliff for us
>> because we have no space for the V extension, but that was just a
>> mistake and there's nothing we can do about it.
>
> I understand the concern. It is good to provide a way to have explicit
> controls of Vector rather than do nothing if such ABI break happens.
> As for implementation details, do you think a system-wide  sysctl
> alone is enough? Or, do we also need a prctl for per-process control?

A few of us were talking in the patchwork meeting.  It's kind of a grey 
area here, but we're just going to play it safe and wait for the 
prctl and sys interfaces to show up before merging this.

I know it's a pain to have to wait another release, but there's still no 
publicly availiable V hardware yet so waiting isn't concretely impacting 
users right now.  If we flip on V now we probably won't get a ton of 
testing, so we risk the ABI break sticking around for a few release 
which would be a huge headache.

Andy said he'd be able to do the prtcl and sys interfaces pretty 
quickly, so hopfully everything's lined up for the next release.

>> > Björn