mbox series

[v8,00/10] riscv: set vstart_eq_zero on mark_vs_dirty

Message ID 20240308215402.117405-1-dbarboza@ventanamicro.com (mailing list archive)
Headers show
Series riscv: set vstart_eq_zero on mark_vs_dirty | expand

Message

Daniel Henrique Barboza March 8, 2024, 9:53 p.m. UTC
Hi,

In this new version we addressed the points rose by LIU Zhiwei in the v7
version. Some patches had to go, additional patches were added:

- patches 1 and 2 from v7: already queued and sent in the last PR.

- patch 6 from v7: moved to patch 1

- patches 2, 3, 4 and 5 (all new): rework how we update tail elements

  Version 7 has a problem with tail elements being updated regardless of
  vstart >= vl because there's no guard for it. To fix that, and be able
  to remove the the vstart >= vl brconds from the translation code (most
  of them - more on that later), we changed vext_set_tail_elems_1s() to
  be a no-op if vstart >= vl. After that we went through all the code
  that was setting tail elems with vext_set_elems_1s() and converted it
  to use vext_set_tail_elems_1s(). We'll not update tail elements if
  vstart >= vl even without a brcond guard in the translation code.

- patch 6 (new): fix scalar move insns. They weren't setting vstart = 0.

- patch 7 (patch 3 from v7): do not remove brconds from scalar move
  insns

  trans_vmv_s_x() and trans_vfmv_s_f() does not have a helper that will
  handle vstart >= vl for them, so they need their brcond. 

- patches 4 and 5 from v7: dropped. We're not removing all brconds, so
  we can't get rid of cpu_vstart and cpu_vl.

Series based on alistair/riscv-to-apply.next. 

Patches missing review: 2, 3, 4, 5, 6.

Daniel Henrique Barboza (9):
  target/riscv/vector_helper.c: set vstart = 0 in GEN_VEXT_VSLIDEUP_VX()
  target/riscv: handle vstart >= vl in vext_set_tail_elems_1s()
  target/riscv/vector_helper.c: do vstart=0 after updating tail
  target/riscv/vector_helper.c: update tail with
    vext_set_tail_elems_1s()
  target/riscv: use vext_set_tail_elems_1s() in vcrypto insns
  trans_rvv.c.inc: set vstart = 0 in int scalar move insns
  target/riscv: remove 'over' brconds from vector trans
  trans_rvv.c.inc: remove redundant mark_vs_dirty() calls
  target/riscv/vector_helper.c: optimize loops in ldst helpers

Ivan Klokov (1):
  target/riscv: Clear vstart_qe_zero flag

 target/riscv/insn_trans/trans_rvbf16.c.inc |  18 +-
 target/riscv/insn_trans/trans_rvv.c.inc    | 205 ++++++---------------
 target/riscv/insn_trans/trans_rvvk.c.inc   |  30 +--
 target/riscv/translate.c                   |   6 +
 target/riscv/vcrypto_helper.c              |  63 +++----
 target/riscv/vector_helper.c               | 168 ++++++-----------
 target/riscv/vector_internals.c            |  28 +++
 target/riscv/vector_internals.h            |   4 +
 8 files changed, 186 insertions(+), 336 deletions(-)

Comments

Richard Henderson March 9, 2024, 8:14 p.m. UTC | #1
On 3/8/24 11:53, Daniel Henrique Barboza wrote:
> - patch 7 (patch 3 from v7): do not remove brconds from scalar move
>    insns
> 
>    trans_vmv_s_x() and trans_vfmv_s_f() does not have a helper that will
>    handle vstart >= vl for them, so they need their brcond.
> 
> - patches 4 and 5 from v7: dropped. We're not removing all brconds, so
>    we can't get rid of cpu_vstart and cpu_vl.

Not important for the bug fix, but for future cleanup:

(1) Use movcond to for these moves instead of brcond.

(2) Use explicit load/store in these two places where cpu_vstart and cpu_vl are still in use.

(3) Now there are no tcg globals that are modified by the vector helpers, which means that 
they can all be marked TCG_CALL_NO_WG, and any that never exit via exception may be marked 
TCG_CALL_NO_RWG.  This may reduce register allocator spill/fill across the affected helpers.


r~
Daniel Henrique Barboza March 9, 2024, 8:24 p.m. UTC | #2
On 3/9/24 17:14, Richard Henderson wrote:
> On 3/8/24 11:53, Daniel Henrique Barboza wrote:
>> - patch 7 (patch 3 from v7): do not remove brconds from scalar move
>>    insns
>>
>>    trans_vmv_s_x() and trans_vfmv_s_f() does not have a helper that will
>>    handle vstart >= vl for them, so they need their brcond.
>>
>> - patches 4 and 5 from v7: dropped. We're not removing all brconds, so
>>    we can't get rid of cpu_vstart and cpu_vl.
> 
> Not important for the bug fix, but for future cleanup:
> 
> (1) Use movcond to for these moves instead of brcond.
> 
> (2) Use explicit load/store in these two places where cpu_vstart and cpu_vl are still in use.
> 
> (3) Now there are no tcg globals that are modified by the vector helpers, which means that they can all be marked TCG_CALL_NO_WG, and any that never exit via exception may be marked TCG_CALL_NO_RWG.  This may reduce register allocator spill/fill across the affected helpers.

Nice. I'll mark it as a TODO to get it done after we get this series merged.


Thanks,

Daniel

> 
> 
> r~