[PULL,53/96] target/ppc: Move VMX integer add/sub saturate insns to decodetree.

Message ID	20240725235410.451624-54-npiggin@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org> From: Nicholas Piggin <npiggin@gmail.com> To: qemu-devel@nongnu.org Cc: Nicholas Piggin <npiggin@gmail.com>, qemu-ppc@nongnu.org, Chinmay Rath <rathc@linux.ibm.com>, Richard Henderson <richard.henderson@linaro.org> Subject: [PULL 53/96] target/ppc: Move VMX integer add/sub saturate insns to decodetree. Date: Fri, 26 Jul 2024 09:53:26 +1000 Message-ID: <20240725235410.451624-54-npiggin@gmail.com> In-Reply-To: <20240725235410.451624-1-npiggin@gmail.com> References: <20240725235410.451624-1-npiggin@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=2607:f8b0:4864:20::102b; envelope-from=npiggin@gmail.com; helo=mail-pj1-x102b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action Precedence: list Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Series	[PULL,01/96] tests/tcg: Skip failing ppc64 multi-threaded tests \| expand [PULL,01/96] tests/tcg: Skip failing ppc64 multi-threaded tests [PULL,02/96] spapr: Migrate ail-mode-3 spapr cap [PULL,03/96] spapr: Free stdout path [PULL,04/96] ppc/vof: Fix unaligned FDT property access [PULL,05/96] accel/kvm: Introduce kvm_create_and_park_vcpu() helper [PULL,06/96] cpu-common.c: export cpu_get_free_index to be reused later [PULL,07/96] target/ppc: handle vcpu hotplug failure gracefully [PULL,08/96] target/ppc/arch_dump: set prstatus pid to cpuid [PULL,09/96] linux-header: PPC: KVM: Update one-reg ids for DEXCR, HASHKEYR and HASHPKEYR [PULL,10/96] target/ppc/cpu_init: Synchronize DEXCR with KVM for migration [PULL,11/96] target/ppc/cpu_init: Synchronize HASHKEYR with KVM for migration [PULL,12/96] target/ppc/cpu_init: Synchronize HASHPKEYR with KVM for migration [PULL,13/96] ppc/pnv: Update Power10's cfam id to use Power10 DD2 [PULL,14/96] ppc/pnv: Fix loss of LPC SERIRQ interrupts [PULL,15/96] ppc/pnv: Implement POWER9 LPC PSI serirq outputs and auto-clear function [PULL,16/96] ppc/pnv: Begin a more complete ADU LPC model for POWER9/10 [PULL,17/96] ppc/pnv: Implement ADU access to LPC space [PULL,18/96] target/ppc: Fix msgsnd for POWER8 [PULL,19/96] ppc/pnv: Add pointer from PnvCPUState to PnvCore [PULL,20/96] ppc/pnv: Move timebase state into PnvCore [PULL,21/96] target/ppc: Move SPR indirect registers into PnvCore [PULL,22/96] ppc/pnv: use class attribute to limit SMT threads for different machines [PULL,23/96] ppc/pnv: Extend chip_pir class method to TIR as well [PULL,24/96] ppc: Add a core_index to CPUPPCState for SMT vCPUs [PULL,25/96] target/ppc: Add helpers to check for SMT sibling threads [PULL,26/96] ppc: Add has_smt_siblings property to CPUPPCState [PULL,27/96] ppc/pnv: Add a big-core mode that joins two regular cores [PULL,28/96] ppc/pnv: Add allow for big-core differences in DT generation [PULL,29/96] ppc/pnv: Implement big-core PVR for Power9/10 [PULL,30/96] ppc/pnv: Implement Power9 CPU core thread state indirect register [PULL,31/96] ppc/pnv: Add POWER10 ChipTOD quirk for big-core [PULL,32/96] ppc/pnv: Add big-core machine property [PULL,33/96] ppc/pnv: Add a CPU nmi and resume function [PULL,34/96] ppc/pnv: Implement POWER10 PC xscom registers for direct controls [PULL,35/96] ppc/pnv: Add an LPAR per core machine option [PULL,36/96] ppc/pnv: Remove ppc target dependency from pnv_xscom.h [PULL,37/96] hw/ssi: Add SPI model [PULL,38/96] hw/ssi: Extend SPI model [PULL,39/96] hw/block: Add Microchip's 25CSM04 to m25p80 [PULL,40/96] hw/ppc: SPI controller wiring to P10 chip [PULL,41/96] tests/qtest: Add pnv-spi-seeprom qtest [PULL,42/96] pnv/xive2: XIVE2 Cache Watch, Cache Flush and Sync Injection support [PULL,43/96] pnv/xive2: Structure/define alignment changes [PULL,44/96] pnv/xive: Support cache flush and queue sync inject with notifications [PULL,45/96] pnv/xive2: Add NVG and NVC to cache watch facility [PULL,46/96] pnv/xive2: Configure Virtualization Structure Tables through the PC [PULL,47/96] pnv/xive2: Enable VST NVG and NVC index compression [PULL,48/96] pnv/xive2: Set Translation Table for the NVC port space [PULL,49/96] pnv/xive2: Fail VST entry address computation if table has no VSD [PULL,50/96] pnv/xive2: Move xive2_nvp_pic_print_info() to xive2.c [PULL,51/96] pnv/xive2: Refine TIMA 'info pic' output [PULL,52/96] pnv/xive2: Dump more END state with 'info pic' [PULL,53/96] target/ppc: Move VMX integer add/sub saturate insns to decodetree. [PULL,54/96] target/ppc: Improve VMX integer add/sub saturate instructions. [PULL,55/96] target/ppc: Move ISA300 flag check out of do_helper_XX3. [PULL,56/96] target/ppc: Move VSX arithmetic and max/min insns to decodetree. [PULL,57/96] target/ppc: Move VSX logical instructions to decodetree. [PULL,58/96] target/ppc: Moving VSX scalar storage access insns to decodetree. [PULL,59/96] target/ppc: Move VSX vector with length storage access insns to decodetree. [PULL,60/96] target/ppc: Move VSX vector storage access insns to decodetree. [PULL,61/96] target/ppc: Move VSX fp compare insns to decodetree. [PULL,62/96] target/ppc: Move get/set_avr64 functions to vmx-impl.c.inc. [PULL,63/96] target/ppc: Update VMX storage access insns to use tcg_gen_qemu_ld/st_i128. [PULL,64/96] target/ppc : Update VSX storage access insns to use tcg_gen_qemu _ld/st_i128. [PULL,65/96] target/ppc: Reorganise and rename ppc_hash32_pp_prot() [PULL,66/96] target/ppc/mmu_common.c: Remove local name for a constant [PULL,67/96] target/ppc/mmu_common.c: Remove single use local variable [PULL,68/96] target/ppc/mmu_common.c: Remove single use local variable [PULL,69/96] target/ppc/mmu_common.c: Remove another single use local variable [PULL,70/96] target/ppc/mmu_common.c: Remove yet another single use local variable [PULL,71/96] target/ppc/mmu_common.c: Return directly in ppc6xx_tlb_pte_check() [PULL,72/96] target/ppc/mmu_common.c: Simplify ppc6xx_tlb_pte_check() [PULL,73/96] target/ppc/mmu_common.c: Remove unused field from mmu_ctx_t [PULL,74/96] target/ppc/mmu_common.c: Remove hash field from mmu_ctx_t [PULL,75/96] target/ppc/mmu_common.c: Remove pte_update_flags() [PULL,76/96] target/ppc/mmu_common.c: Remove nx field from mmu_ctx_t [PULL,77/96] target/ppc/mmu_common.c: Convert local variable to bool [PULL,78/96] target/ppc/mmu_common.c: Remove single use local variable [PULL,79/96] target/ppc/mmu_common.c: Simplify a switch statement [PULL,80/96] target/ppc/mmu_common.c: Inline and remove ppc6xx_tlb_pte_check() [PULL,81/96] target/ppc/mmu_common.c: Remove ptem field from mmu_ctx_t [PULL,82/96] target/ppc: Add function to get protection key for hash32 MMU [PULL,83/96] target/ppc/mmu-hash32.c: Inline and remove ppc_hash32_pte_prot() [PULL,84/96] target/ppc/mmu_common.c: Init variable in function that relies on it [PULL,85/96] target/ppc/mmu_common.c: Remove key field from mmu_ctx_t [PULL,86/96] target/ppc/mmu_common.c: Stop using ctx in ppc6xx_tlb_check() [PULL,87/96] target/ppc/mmu_common.c: Rename function parameter [PULL,88/96] target/ppc/mmu_common.c: Use defines instead of numeric constants [PULL,89/96] target/ppc: Remove bat_size_prot() [PULL,90/96] target/ppc/mmu_common.c: Stop using ctx in get_bat_6xx_tlb() [PULL,91/96] target/ppc/mmu_common.c: Remove mmu_ctx_t [PULL,92/96] target/ppc/mmu-hash32.c: Inline and remove ppc_hash32_pte_raddr() [PULL,93/96] target/ppc/mmu-hash32.c: Move get_pteg_offset32() to the header [PULL,94/96] target/ppc: Unexport some functions from mmu-book3s-v3.h [PULL,95/96] target/ppc/mmu-radix64: Remove externally unused parts from header [PULL,96/96] target/ppc: Remove includes from mmu-book3s-v3.h

diff --git a/target/ppc/helper.h b/target/ppc/helper.h index 4fa089cbf9..dd7526bc2e 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -203,18 +203,18 @@ DEF_HELPER_FLAGS_3(vsro, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(vsrv, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(vslv, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VPRTYBQ, TCG_CALL_NO_RWG, void, avr, avr, i32) -DEF_HELPER_FLAGS_5(vaddsbs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) -DEF_HELPER_FLAGS_5(vaddshs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) -DEF_HELPER_FLAGS_5(vaddsws, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) -DEF_HELPER_FLAGS_5(vsubsbs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) -DEF_HELPER_FLAGS_5(vsubshs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) -DEF_HELPER_FLAGS_5(vsubsws, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) -DEF_HELPER_FLAGS_5(vaddubs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) -DEF_HELPER_FLAGS_5(vadduhs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) -DEF_HELPER_FLAGS_5(vadduws, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) -DEF_HELPER_FLAGS_5(vsububs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) -DEF_HELPER_FLAGS_5(vsubuhs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) -DEF_HELPER_FLAGS_5(vsubuws, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) +DEF_HELPER_FLAGS_5(VADDSBS, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) +DEF_HELPER_FLAGS_5(VADDSHS, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) +DEF_HELPER_FLAGS_5(VADDSWS, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) +DEF_HELPER_FLAGS_5(VSUBSBS, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) +DEF_HELPER_FLAGS_5(VSUBSHS, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) +DEF_HELPER_FLAGS_5(VSUBSWS, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) +DEF_HELPER_FLAGS_5(VADDUBS, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) +DEF_HELPER_FLAGS_5(VADDUHS, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) +DEF_HELPER_FLAGS_5(VADDUWS, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) +DEF_HELPER_FLAGS_5(VSUBUBS, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) +DEF_HELPER_FLAGS_5(VSUBUHS, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) +DEF_HELPER_FLAGS_5(VSUBUWS, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) DEF_HELPER_FLAGS_3(VADDUQM, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_4(VADDECUQ, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) DEF_HELPER_FLAGS_4(VADDEUQM, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index ee33141476..8988fca60e 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -832,6 +832,14 @@ VADDCUW 000100 ..... ..... ..... 00110000000 @VX VADDCUQ 000100 ..... ..... ..... 00101000000 @VX VADDUQM 000100 ..... ..... ..... 00100000000 @VX +VADDSBS 000100 ..... ..... ..... 01100000000 @VX +VADDSHS 000100 ..... ..... ..... 01101000000 @VX +VADDSWS 000100 ..... ..... ..... 01110000000 @VX + +VADDUBS 000100 ..... ..... ..... 01000000000 @VX +VADDUHS 000100 ..... ..... ..... 01001000000 @VX +VADDUWS 000100 ..... ..... ..... 01010000000 @VX + VADDEUQM 000100 ..... ..... ..... ..... 111100 @VA VADDECUQ 000100 ..... ..... ..... ..... 111101 @VA @@ -839,6 +847,14 @@ VSUBCUW 000100 ..... ..... ..... 10110000000 @VX VSUBCUQ 000100 ..... ..... ..... 10101000000 @VX VSUBUQM 000100 ..... ..... ..... 10100000000 @VX +VSUBSBS 000100 ..... ..... ..... 11100000000 @VX +VSUBSHS 000100 ..... ..... ..... 11101000000 @VX +VSUBSWS 000100 ..... ..... ..... 11110000000 @VX + +VSUBUBS 000100 ..... ..... ..... 11000000000 @VX +VSUBUHS 000100 ..... ..... ..... 11001000000 @VX +VSUBUWS 000100 ..... ..... ..... 11010000000 @VX + VSUBECUQ 000100 ..... ..... ..... ..... 111111 @VA VSUBEUQM 000100 ..... ..... ..... ..... 111110 @VA diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c index 2c6b633d65..ef4b2e75d6 100644 --- a/target/ppc/int_helper.c +++ b/target/ppc/int_helper.c @@ -541,7 +541,7 @@ VARITHFPFMA(nmsubfp, float_muladd_negate_result | float_muladd_negate_c); } #define VARITHSAT_DO(name, op, optype, cvt, element) \ - void helper_v##name(ppc_avr_t *r, ppc_avr_t *vscr_sat, \ + void helper_V##name(ppc_avr_t *r, ppc_avr_t *vscr_sat, \ ppc_avr_t *a, ppc_avr_t *b, uint32_t desc) \ { \ int sat = 0; \ @@ -555,17 +555,17 @@ VARITHFPFMA(nmsubfp, float_muladd_negate_result | float_muladd_negate_c); } \ } #define VARITHSAT_SIGNED(suffix, element, optype, cvt) \ - VARITHSAT_DO(adds##suffix##s, +, optype, cvt, element) \ - VARITHSAT_DO(subs##suffix##s, -, optype, cvt, element) + VARITHSAT_DO(ADDS##suffix##S, +, optype, cvt, element) \ + VARITHSAT_DO(SUBS##suffix##S, -, optype, cvt, element) #define VARITHSAT_UNSIGNED(suffix, element, optype, cvt) \ - VARITHSAT_DO(addu##suffix##s, +, optype, cvt, element) \ - VARITHSAT_DO(subu##suffix##s, -, optype, cvt, element) -VARITHSAT_SIGNED(b, s8, int16_t, cvtshsb) -VARITHSAT_SIGNED(h, s16, int32_t, cvtswsh) -VARITHSAT_SIGNED(w, s32, int64_t, cvtsdsw) -VARITHSAT_UNSIGNED(b, u8, uint16_t, cvtshub) -VARITHSAT_UNSIGNED(h, u16, uint32_t, cvtswuh) -VARITHSAT_UNSIGNED(w, u32, uint64_t, cvtsduw) + VARITHSAT_DO(ADDU##suffix##S, +, optype, cvt, element) \ + VARITHSAT_DO(SUBU##suffix##S, -, optype, cvt, element) +VARITHSAT_SIGNED(B, s8, int16_t, cvtshsb) +VARITHSAT_SIGNED(H, s16, int32_t, cvtswsh) +VARITHSAT_SIGNED(W, s32, int64_t, cvtsdsw) +VARITHSAT_UNSIGNED(B, u8, uint16_t, cvtshub) +VARITHSAT_UNSIGNED(H, u16, uint32_t, cvtswuh) +VARITHSAT_UNSIGNED(W, u32, uint64_t, cvtsduw) #undef VARITHSAT_CASE #undef VARITHSAT_DO #undef VARITHSAT_SIGNED diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc index 8084af75cc..fdb283c1d4 100644 --- a/target/ppc/translate/vmx-impl.c.inc +++ b/target/ppc/translate/vmx-impl.c.inc @@ -1047,58 +1047,6 @@ TRANS(VRLQ, do_vector_rotl_quad, false, false) TRANS(VRLQNM, do_vector_rotl_quad, true, false) TRANS(VRLQMI, do_vector_rotl_quad, false, true) -#define GEN_VXFORM_SAT(NAME, VECE, NORM, SAT, OPC2, OPC3) \ -static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t, \ - TCGv_vec sat, TCGv_vec a, \ - TCGv_vec b) \ -{ \ - TCGv_vec x = tcg_temp_new_vec_matching(t); \ - glue(glue(tcg_gen_, NORM), _vec)(VECE, x, a, b); \ - glue(glue(tcg_gen_, SAT), _vec)(VECE, t, a, b); \ - tcg_gen_cmp_vec(TCG_COND_NE, VECE, x, x, t); \ - tcg_gen_or_vec(VECE, sat, sat, x); \ -} \ -static void glue(gen_, NAME)(DisasContext *ctx) \ -{ \ - static const TCGOpcode vecop_list[] = { \ - glue(glue(INDEX_op_, NORM), _vec), \ - glue(glue(INDEX_op_, SAT), _vec), \ - INDEX_op_cmp_vec, 0 \ - }; \ - static const GVecGen4 g = { \ - .fniv = glue(glue(gen_, NAME), _vec), \ - .fno = glue(gen_helper_, NAME), \ - .opt_opc = vecop_list, \ - .write_aofs = true, \ - .vece = VECE, \ - }; \ - if (unlikely(!ctx->altivec_enabled)) { \ - gen_exception(ctx, POWERPC_EXCP_VPU); \ - return; \ - } \ - tcg_gen_gvec_4(avr_full_offset(rD(ctx->opcode)), \ - offsetof(CPUPPCState, vscr_sat), \ - avr_full_offset(rA(ctx->opcode)), \ - avr_full_offset(rB(ctx->opcode)), \ - 16, 16, &g); \ -} - -GEN_VXFORM_SAT(vaddubs, MO_8, add, usadd, 0, 8); -GEN_VXFORM_DUAL_EXT(vaddubs, PPC_ALTIVEC, PPC_NONE, 0, \ - vmul10uq, PPC_NONE, PPC2_ISA300, 0x0000F800) -GEN_VXFORM_SAT(vadduhs, MO_16, add, usadd, 0, 9); -GEN_VXFORM_DUAL(vadduhs, PPC_ALTIVEC, PPC_NONE, \ - vmul10euq, PPC_NONE, PPC2_ISA300) -GEN_VXFORM_SAT(vadduws, MO_32, add, usadd, 0, 10); -GEN_VXFORM_SAT(vaddsbs, MO_8, add, ssadd, 0, 12); -GEN_VXFORM_SAT(vaddshs, MO_16, add, ssadd, 0, 13); -GEN_VXFORM_SAT(vaddsws, MO_32, add, ssadd, 0, 14); -GEN_VXFORM_SAT(vsububs, MO_8, sub, ussub, 0, 24); -GEN_VXFORM_SAT(vsubuhs, MO_16, sub, ussub, 0, 25); -GEN_VXFORM_SAT(vsubuws, MO_32, sub, ussub, 0, 26); -GEN_VXFORM_SAT(vsubsbs, MO_8, sub, sssub, 0, 28); -GEN_VXFORM_SAT(vsubshs, MO_16, sub, sssub, 0, 29); -GEN_VXFORM_SAT(vsubsws, MO_32, sub, sssub, 0, 30); GEN_VXFORM_TRANS(vsl, 2, 7); GEN_VXFORM_TRANS(vsr, 2, 11); GEN_VXFORM_ENV(vpkuhum, 7, 0); @@ -2641,26 +2589,14 @@ static void gen_xpnd04_2(DisasContext *ctx) } } - -GEN_VXFORM_DUAL(vsubsws, PPC_ALTIVEC, PPC_NONE, \ - xpnd04_2, PPC_NONE, PPC2_ISA300) - GEN_VXFORM_DUAL(vsububm, PPC_ALTIVEC, PPC_NONE, \ bcdadd, PPC_NONE, PPC2_ALTIVEC_207) -GEN_VXFORM_DUAL(vsububs, PPC_ALTIVEC, PPC_NONE, \ - bcdadd, PPC_NONE, PPC2_ALTIVEC_207) GEN_VXFORM_DUAL(vsubuhm, PPC_ALTIVEC, PPC_NONE, \ bcdsub, PPC_NONE, PPC2_ALTIVEC_207) -GEN_VXFORM_DUAL(vsubuhs, PPC_ALTIVEC, PPC_NONE, \ - bcdsub, PPC_NONE, PPC2_ALTIVEC_207) -GEN_VXFORM_DUAL(vaddshs, PPC_ALTIVEC, PPC_NONE, \ - bcdcpsgn, PPC_NONE, PPC2_ISA300) GEN_VXFORM_DUAL(vsubudm, PPC2_ALTIVEC_207, PPC_NONE, \ bcds, PPC_NONE, PPC2_ISA300) GEN_VXFORM_DUAL(vsubuwm, PPC_ALTIVEC, PPC_NONE, \ bcdus, PPC_NONE, PPC2_ISA300) -GEN_VXFORM_DUAL(vsubsbs, PPC_ALTIVEC, PPC_NONE, \ - bcdtrunc, PPC_NONE, PPC2_ISA300) static void gen_vsbox(DisasContext *ctx) { @@ -2937,6 +2873,180 @@ static bool do_vx_vaddsubcuw(DisasContext *ctx, arg_VX *a, int add) TRANS(VSUBCUW, do_vx_vaddsubcuw, 0) TRANS(VADDCUW, do_vx_vaddsubcuw, 1) +/* Integer Add/Sub Saturate Instructions */ +static inline void do_vadd_vsub_sat +( + unsigned vece, TCGv_vec t, TCGv_vec sat, TCGv_vec a, TCGv_vec b, + void (*norm_op)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec), + void (*sat_op)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec)) +{ + TCGv_vec x = tcg_temp_new_vec_matching(t); + norm_op(vece, x, a, b); + sat_op(vece, t, a, b); + tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); + tcg_gen_or_vec(vece, sat, sat, x); +} + +static void gen_vadd_sat_u(unsigned vece, TCGv_vec t, TCGv_vec sat, + TCGv_vec a, TCGv_vec b) +{ + do_vadd_vsub_sat(vece, t, sat, a, b, tcg_gen_add_vec, tcg_gen_usadd_vec); +} + +static void gen_vadd_sat_s(unsigned vece, TCGv_vec t, TCGv_vec sat, + TCGv_vec a, TCGv_vec b) +{ + do_vadd_vsub_sat(vece, t, sat, a, b, tcg_gen_add_vec, tcg_gen_ssadd_vec); +} + +static void gen_vsub_sat_u(unsigned vece, TCGv_vec t, TCGv_vec sat, + TCGv_vec a, TCGv_vec b) +{ + do_vadd_vsub_sat(vece, t, sat, a, b, tcg_gen_sub_vec, tcg_gen_ussub_vec); +} + +static void gen_vsub_sat_s(unsigned vece, TCGv_vec t, TCGv_vec sat, + TCGv_vec a, TCGv_vec b) +{ + do_vadd_vsub_sat(vece, t, sat, a, b, tcg_gen_sub_vec, tcg_gen_sssub_vec); +} + +/* + * Signed/Unsigned add/sub helper ops for byte/halfword/word + * GVecGen4 struct variants. + */ +static const TCGOpcode vecop_list_sub_u[] = { + INDEX_op_sub_vec, INDEX_op_ussub_vec, INDEX_op_cmp_vec, 0 +}; +static const TCGOpcode vecop_list_sub_s[] = { + INDEX_op_sub_vec, INDEX_op_sssub_vec, INDEX_op_cmp_vec, 0 +}; +static const TCGOpcode vecop_list_add_u[] = { + INDEX_op_add_vec, INDEX_op_usadd_vec, INDEX_op_cmp_vec, 0 +}; +static const TCGOpcode vecop_list_add_s[] = { + INDEX_op_add_vec, INDEX_op_ssadd_vec, INDEX_op_cmp_vec, 0 +}; + +static const GVecGen4 op_vsububs = { + .fniv = gen_vsub_sat_u, + .fno = gen_helper_VSUBUBS, + .opt_opc = vecop_list_sub_u, + .write_aofs = true, + .vece = MO_8 +}; + +static const GVecGen4 op_vaddubs = { + .fniv = gen_vadd_sat_u, + .fno = gen_helper_VADDUBS, + .opt_opc = vecop_list_add_u, + .write_aofs = true, + .vece = MO_8 +}; + +static const GVecGen4 op_vsubuhs = { + .fniv = gen_vsub_sat_u, + .fno = gen_helper_VSUBUHS, + .opt_opc = vecop_list_sub_u, + .write_aofs = true, + .vece = MO_16 +}; + +static const GVecGen4 op_vadduhs = { + .fniv = gen_vadd_sat_u, + .fno = gen_helper_VADDUHS, + .opt_opc = vecop_list_add_u, + .write_aofs = true, + .vece = MO_16 +}; + +static const GVecGen4 op_vsubuws = { + .fniv = gen_vsub_sat_u, + .fno = gen_helper_VSUBUWS, + .opt_opc = vecop_list_sub_u, + .write_aofs = true, + .vece = MO_32 +}; + +static const GVecGen4 op_vadduws = { + .fniv = gen_vadd_sat_u, + .fno = gen_helper_VADDUWS, + .opt_opc = vecop_list_add_u, + .write_aofs = true, + .vece = MO_32 +}; + +static const GVecGen4 op_vsubsbs = { + .fniv = gen_vsub_sat_s, + .fno = gen_helper_VSUBSBS, + .opt_opc = vecop_list_sub_s, + .write_aofs = true, + .vece = MO_8 +}; + +static const GVecGen4 op_vaddsbs = { + .fniv = gen_vadd_sat_s, + .fno = gen_helper_VADDSBS, + .opt_opc = vecop_list_add_s, + .write_aofs = true, + .vece = MO_8 +}; + +static const GVecGen4 op_vsubshs = { + .fniv = gen_vsub_sat_s, + .fno = gen_helper_VSUBSHS, + .opt_opc = vecop_list_sub_s, + .write_aofs = true, + .vece = MO_16 +}; + +static const GVecGen4 op_vaddshs = { + .fniv = gen_vadd_sat_s, + .fno = gen_helper_VADDSHS, + .opt_opc = vecop_list_add_s, + .write_aofs = true, + .vece = MO_16 +}; + +static const GVecGen4 op_vsubsws = { + .fniv = gen_vsub_sat_s, + .fno = gen_helper_VSUBSWS, + .opt_opc = vecop_list_sub_s, + .write_aofs = true, + .vece = MO_32 +}; + +static const GVecGen4 op_vaddsws = { + .fniv = gen_vadd_sat_s, + .fno = gen_helper_VADDSWS, + .opt_opc = vecop_list_add_s, + .write_aofs = true, + .vece = MO_32 +}; + +static bool do_vx_vadd_vsub_sat(DisasContext *ctx, arg_VX *a, const GVecGen4 *op) +{ + REQUIRE_VECTOR(ctx); + tcg_gen_gvec_4(avr_full_offset(a->vrt), offsetof(CPUPPCState, vscr_sat), + avr_full_offset(a->vra), avr_full_offset(a->vrb), + 16, 16, op); + + return true; +} + +TRANS_FLAGS(ALTIVEC, VSUBUBS, do_vx_vadd_vsub_sat, &op_vsububs) +TRANS_FLAGS(ALTIVEC, VSUBUHS, do_vx_vadd_vsub_sat, &op_vsubuhs) +TRANS_FLAGS(ALTIVEC, VSUBUWS, do_vx_vadd_vsub_sat, &op_vsubuws) +TRANS_FLAGS(ALTIVEC, VSUBSBS, do_vx_vadd_vsub_sat, &op_vsubsbs) +TRANS_FLAGS(ALTIVEC, VSUBSHS, do_vx_vadd_vsub_sat, &op_vsubshs) +TRANS_FLAGS(ALTIVEC, VSUBSWS, do_vx_vadd_vsub_sat, &op_vsubsws) +TRANS_FLAGS(ALTIVEC, VADDUBS, do_vx_vadd_vsub_sat, &op_vaddubs) +TRANS_FLAGS(ALTIVEC, VADDUHS, do_vx_vadd_vsub_sat, &op_vadduhs) +TRANS_FLAGS(ALTIVEC, VADDUWS, do_vx_vadd_vsub_sat, &op_vadduws) +TRANS_FLAGS(ALTIVEC, VADDSBS, do_vx_vadd_vsub_sat, &op_vaddsbs) +TRANS_FLAGS(ALTIVEC, VADDSHS, do_vx_vadd_vsub_sat, &op_vaddshs) +TRANS_FLAGS(ALTIVEC, VADDSWS, do_vx_vadd_vsub_sat, &op_vaddsws) + static bool do_vx_vmuleo(DisasContext *ctx, arg_VX *a, bool even, void (*gen_mul)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64)) { diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc index 7bb11b0549..e28958a126 100644 --- a/target/ppc/translate/vmx-ops.c.inc +++ b/target/ppc/translate/vmx-ops.c.inc @@ -54,18 +54,13 @@ GEN_VXFORM(vsro, 6, 17), GEN_VXFORM(xpnd04_1, 0, 22), GEN_VXFORM_300(bcdsr, 0, 23), GEN_VXFORM_300(bcdsr, 0, 31), -GEN_VXFORM_DUAL(vaddubs, vmul10uq, 0, 8, PPC_ALTIVEC, PPC_NONE), -GEN_VXFORM_DUAL(vadduhs, vmul10euq, 0, 9, PPC_ALTIVEC, PPC_NONE), -GEN_VXFORM(vadduws, 0, 10), -GEN_VXFORM(vaddsbs, 0, 12), -GEN_VXFORM_DUAL(vaddshs, bcdcpsgn, 0, 13, PPC_ALTIVEC, PPC_NONE), -GEN_VXFORM(vaddsws, 0, 14), -GEN_VXFORM_DUAL(vsububs, bcdadd, 0, 24, PPC_ALTIVEC, PPC_NONE), -GEN_VXFORM_DUAL(vsubuhs, bcdsub, 0, 25, PPC_ALTIVEC, PPC_NONE), -GEN_VXFORM(vsubuws, 0, 26), -GEN_VXFORM_DUAL(vsubsbs, bcdtrunc, 0, 28, PPC_ALTIVEC, PPC2_ISA300), -GEN_VXFORM(vsubshs, 0, 29), -GEN_VXFORM_DUAL(vsubsws, xpnd04_2, 0, 30, PPC_ALTIVEC, PPC_NONE), +GEN_VXFORM_300_EXT(vmul10uq, 0, 8, 0x0000F800), +GEN_VXFORM_300(vmul10euq, 0, 9), +GEN_VXFORM_300(bcdcpsgn, 0, 13), +GEN_VXFORM_207(bcdadd, 0, 24), +GEN_VXFORM_207(bcdsub, 0, 25), +GEN_VXFORM_300(bcdtrunc, 0, 28), +GEN_VXFORM_300(xpnd04_2, 0, 30), GEN_VXFORM_300(bcdtrunc, 0, 20), GEN_VXFORM_300(bcdutrunc, 0, 21), GEN_VXFORM(vsl, 2, 7),

[PULL,53/96] target/ppc: Move VMX integer add/sub saturate insns to decodetree.

Commit Message

Patch