[00/18] target/i386: make most SSE helpers generic in the vector size

Message ID 20220825221411.35122-1-pbonzini@redhat.com (mailing list archive)

Message

Paolo Bonzini Aug. 25, 2022, 10:13 p.m. UTC
This is the first half of Paul's series from last April, reorganized
to have no occurrence of YMM_ONLY or SHIFT == 2---meaning it can be
committed without much fuss, even without a plan for the implementation
of AVX decoding.

In most cases this is done by using loops that apply the same code to
all of MMX/SSE/AVX.  In some cases AVX needs special-casing for the two
128-bit lanes, and there the AVX code is simply missing.  The missing helper
code is just 100 lines though, so this _is_ the lion's share of the work
to adapt the existing helpers.
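
As a rough illustration of the "same loop for every vector size" idea,
here is a minimal standalone sketch.  It is not code from the series:
SketchReg and sketch_paddw are hypothetical names, and the shift
argument only mimics the convention that 0, 1 and 2 select 64-bit,
128-bit and 256-bit vectors.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a vector register; not QEMU's register type. */
typedef struct {
    uint16_t w[16];   /* room for 256 bits of 16-bit elements */
} SketchReg;

/*
 * Element-wise 16-bit add, d = d + s, for any vector size.
 * shift: 0 = 64-bit (MMX), 1 = 128-bit (SSE), 2 = 256-bit (AVX).
 * (Assumed convention for this sketch only.)
 */
static void sketch_paddw(SketchReg *d, const SketchReg *s, int shift)
{
    assert(shift >= 0 && shift <= 2);
    int n = 4 << shift;   /* number of 16-bit elements in the vector */

    for (int i = 0; i < n; i++) {
        /* The cast makes the wrap-around modulo 2^16 explicit. */
        d->w[i] = (uint16_t)(d->w[i] + s->w[i]);
    }
}
```

An operation like this needs no per-lane special case because each
element depends only on the matching element of the other operand;
operations that work within each 128-bit lane would need more than
a wider loop bound.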

The full work, with the AVX parts rebased on top of these, is at branch
i386-avx of https://gitlab.com/bonzini/qemu.  The branch passes the
tests that Paul had posted, while this reduced part passes the SSE
version that I sent earlier today.

Paolo

Based-on: <20220825164827.392942-1-pbonzini@redhat.com>

Paul Brook (18):
  i386: Rework sse_op_table1
  i386: Rework sse_op_table6/7
  i386: Add CHECK_NO_VEX
  i386: Move 3DNOW decoder
  i386: Add ZMM_OFFSET macro
  i386: Rewrite vector shift helper
  i386: Rewrite simple integer vector helpers
  i386: Misc integer AVX helper prep
  i386: Destructive vector helpers for AVX
  i386: Add size suffix to vector FP helpers
  i386: Floating point arithmetic helper AVX prep
  i386: Dot product AVX helper prep
  i386: reimplement AVX comparison helpers
  i386: Destructive FP helpers for AVX
  i386: Misc AVX helper prep
  i386: Rewrite blendv helpers
  i386: AVX pclmulqdq prep
  i386: AVX+AES helpers prep

 target/i386/ops_sse.h        | 1781 ++++++++++++++++++----------------
 target/i386/ops_sse_header.h |   68 +-
 target/i386/tcg/translate.c  |  673 +++++++------
 3 files changed, 1345 insertions(+), 1177 deletions(-)

Comments

Richard Henderson Aug. 25, 2022, 11:32 p.m. UTC | #1
On 8/25/22 15:13, Paolo Bonzini wrote:
> This is the first half of Paul's series from last April, reorganized
> to have no occurrence of YMM_ONLY or SHIFT == 2---meaning it can be
> committed without much fuss, even without a plan for the implementation
> of AVX decoding.
> 
> In most cases this is done by using loops that apply the same code to
> all of MMX/SSE/AVX.  In some cases AVX needs special-casing for the two
> 128-bit lanes, and there the AVX code is simply missing.  The missing helper
> code is just 100 lines though, so this _is_ the lion's share of the work
> to adapt the existing helpers.

Ok.  I'll note that this is a decent intermediate step for
further conversion to tcg/tcg-op-gvec.h, which has a parameter
for the vector length instead of having N functions with the
length implicit in each name.
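
To make the contrast concrete, here is a small standalone sketch of the
two styles.  All names are hypothetical and none of this is the
tcg/tcg-op-gvec.h API; it only shows N functions with the length
implied by the name next to one function that takes the operation size
in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Style A: one function per size, with the length implicit in the name. */
static void sketch_xor_xmm(uint64_t *d, const uint64_t *a, const uint64_t *b)
{
    for (int i = 0; i < 2; i++) {      /* 128 bits = 2 x 64-bit words */
        d[i] = a[i] ^ b[i];
    }
}

static void sketch_xor_ymm(uint64_t *d, const uint64_t *a, const uint64_t *b)
{
    for (int i = 0; i < 4; i++) {      /* 256 bits = 4 x 64-bit words */
        d[i] = a[i] ^ b[i];
    }
}

/* Style B: one function; the caller passes the operation size in bytes. */
static void sketch_xor_sized(uint64_t *d, const uint64_t *a,
                             const uint64_t *b, size_t oprsz)
{
    assert(oprsz % sizeof(uint64_t) == 0);
    for (size_t i = 0; i < oprsz / sizeof(uint64_t); i++) {
        d[i] = a[i] ^ b[i];
    }
}
```

With style B, calling sketch_xor_sized(d, a, b, 16) does the work of
sketch_xor_xmm and passing 32 does the work of sketch_xor_ymm, so a new
vector width does not require a new function.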


r~