From patchwork Tue Apr 26 12:50:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lucas Mateus Martins Araujo e Castro X-Patchwork-Id: 12827074 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 30675C433EF for ; Tue, 26 Apr 2022 13:03:07 +0000 (UTC) Received: from localhost ([::1]:55136 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njKqQ-0006KH-9C for qemu-devel@archiver.kernel.org; Tue, 26 Apr 2022 09:03:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45336) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njKeV-00041H-GI; Tue, 26 Apr 2022 08:50:47 -0400 Received: from [187.72.171.209] (port=32914 helo=outlook.eldorado.org.br) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njKeT-00005X-J2; Tue, 26 Apr 2022 08:50:46 -0400 Received: from p9ibm ([10.10.71.235]) by outlook.eldorado.org.br over TLS secured channel with Microsoft SMTPSVC(8.5.9600.16384); Tue, 26 Apr 2022 09:50:37 -0300 Received: from eldorado.org.br (unknown [10.10.70.45]) by p9ibm (Postfix) with ESMTP id B1EB8801008; Tue, 26 Apr 2022 09:50:36 -0300 (-03) From: "Lucas Mateus Castro(alqotel)" To: qemu-ppc@nongnu.org Subject: [RFC PATCH 1/7] target/ppc: Implement xxm[tf]acc and xxsetaccz Date: Tue, 26 Apr 2022 09:50:22 -0300 Message-Id: <20220426125028.18844-2-lucas.araujo@eldorado.org.br> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220426125028.18844-1-lucas.araujo@eldorado.org.br> References: <20220426125028.18844-1-lucas.araujo@eldorado.org.br> MIME-Version: 1.0 X-OriginalArrivalTime: 26 Apr 2022 12:50:37.0021 (UTC) 
FILETIME=[39E700D0:01D8596C] X-Host-Lookup-Failed: Reverse DNS lookup failed for 187.72.171.209 (failed) Received-SPF: pass client-ip=187.72.171.209; envelope-from=lucas.araujo@eldorado.org.br; helo=outlook.eldorado.org.br X-Spam_score_int: -4 X-Spam_score: -0.5 X-Spam_bar: / X-Spam_report: (-0.5 / 5.0 requ) BAYES_00=-1.9, PDS_HP_HELO_NORDNS=0.659, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Daniel Henrique Barboza , richard.henderson@linaro.org, Greg Kurz , "open list:All patches CC here" , "Lucas Mateus Castro \(alqotel\)" , =?utf-8?q?C=C3=A9dric_Le_Goater?= , David Gibson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: "Lucas Mateus Castro (alqotel)" Implement the following PowerISA v3.1 instructions: xxmfacc: VSX Move From Accumulator xxmtacc: VSX Move To Accumulator xxsetaccz: VSX Set Accumulator to Zero PowerISA 3.1 states that, for the current version of the architecture, "the hardware implementation provides the effect of ACC[i] and VSRs 4*i to 4*i + 3 logically containing the same data" and "The Accumulators introduce no new logical state at this time" (page 501). For now it seems unnecessary to create new structures, so this patch treats ACC[i] as VSRs 4*i to 4*i+3; moves to and from accumulators are therefore no-ops. 
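For illustration only (this is not part of the patch), the aliasing described above can be modelled in plain C. The sketch assumes a simplified register file of 64 128-bit VSRs; the names model_vsr_t, model_cpu_t, model_acc_first_vsr, model_xxsetaccz and model_xxmtacc are hypothetical and do not exist in QEMU.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the VSR file: 64 registers of 128 bits each. */
typedef struct {
    uint64_t dword[2];
} model_vsr_t;

typedef struct {
    model_vsr_t vsr[64];
} model_cpu_t;

/* ACC[i] aliases VSRs 4*i .. 4*i+3, so its first VSR index is 4*i. */
static inline int model_acc_first_vsr(int acc)
{
    return acc * 4;
}

/* Model of xxsetaccz: clear the four VSRs that back the accumulator. */
static inline void model_xxsetaccz(model_cpu_t *cpu, int acc)
{
    memset(&cpu->vsr[model_acc_first_vsr(acc)], 0, 4 * sizeof(model_vsr_t));
}

/* Model of xxmtacc/xxmfacc: the data is already shared, nothing to copy. */
static inline void model_xxmtacc(model_cpu_t *cpu, int acc)
{
    (void)cpu;
    (void)acc;
}
```

Under this assumed layout ACC[2] occupies VSRs 8 to 11, so zeroing it leaves VSR 7 and VSR 12 untouched.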
Signed-off-by: Lucas Mateus Castro (alqotel) Reviewed-by: Richard Henderson --- target/ppc/insn32.decode | 9 ++++++++ target/ppc/translate/vsx-impl.c.inc | 36 +++++++++++++++++++++++++++++ 2 files changed, 45 insertions(+) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index 39372fe673..7a76bedfa6 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -151,6 +151,9 @@ &X_vrt_frbp vrt frbp @X_vrt_frbp ...... vrt:5 ..... ....0 .......... . &X_vrt_frbp frbp=%x_frbp +&X_a ra +@X_a ...... ra:3 .. ..... ..... .......... . &X_a + %xx_xt 0:1 21:5 %xx_xb 1:1 11:5 %xx_xa 2:1 16:5 @@ -710,3 +713,9 @@ XVTLSBB 111100 ... -- 00010 ..... 111011011 . - @XX2_bf_xb &XL_s s:uint8_t @XL_s ......-------------- s:1 .......... - &XL_s RFEBB 010011-------------- . 0010010010 - @XL_s + +## Accumulator Instructions + +XXMFACC 011111 ... -- 00000 ----- 0010110001 - @X_a +XXMTACC 011111 ... -- 00001 ----- 0010110001 - @X_a +XXSETACCZ 011111 ... -- 00011 ----- 0010110001 - @X_a diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index 3692740736..919b889c40 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -2787,6 +2787,42 @@ static bool trans_XVCVBF16SPN(DisasContext *ctx, arg_XX2 *a) return true; } + /* + * The PowerISA 3.1 mentions that for the current version of the + * architecture, "the hardware implementation provides the effect of + * ACC[i] and VSRs 4*i to 4*i + 3 logically containing the same data" + * and "The Accumulators introduce no new logical state at this time" + * (page 501). For now it seems unnecessary to create new structures, + * so this patch just uses ACC[i] as VSRs 4*i to 4*i+3 and therefore + * move to and from accumulators are no-ops. 
+ */ +static bool trans_XXMFACC(DisasContext *ctx, arg_X_a *a) +{ + REQUIRE_INSNS_FLAGS2(ctx, ISA310); + REQUIRE_VSX(ctx); + return true; +} + +static bool trans_XXMTACC(DisasContext *ctx, arg_X_a *a) +{ + REQUIRE_INSNS_FLAGS2(ctx, ISA310); + REQUIRE_VSX(ctx); + return true; +} + +static bool trans_XXSETACCZ(DisasContext *ctx, arg_X_a *a) +{ + REQUIRE_INSNS_FLAGS2(ctx, ISA310); + REQUIRE_VSX(ctx); + int i; + TCGv_i64 zero = tcg_constant_i64(0); + for (i = 0; i < 4; i++) { + set_cpu_vsr(a->ra * 4 + i, zero, false); + set_cpu_vsr(a->ra * 4 + i, zero, true); + } + return true; +} + #undef GEN_XX2FORM #undef GEN_XX3FORM #undef GEN_XX2IFORM From patchwork Tue Apr 26 12:50:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lucas Mateus Martins Araujo e Castro X-Patchwork-Id: 12827109 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 69695C433F5 for ; Tue, 26 Apr 2022 13:06:29 +0000 (UTC) Received: from localhost ([::1]:36302 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njKtg-0004Rf-Bq for qemu-devel@archiver.kernel.org; Tue, 26 Apr 2022 09:06:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45362) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njKeY-0004Ay-Ab; Tue, 26 Apr 2022 08:50:50 -0400 Received: from [187.72.171.209] (port=32914 helo=outlook.eldorado.org.br) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njKeW-00005X-7T; Tue, 26 Apr 2022 08:50:50 -0400 Received: from p9ibm ([10.10.71.235]) by outlook.eldorado.org.br over TLS secured channel with Microsoft SMTPSVC(8.5.9600.16384); Tue, 26 Apr 2022 
09:50:37 -0300 Received: from eldorado.org.br (unknown [10.10.70.45]) by p9ibm (Postfix) with ESMTP id E74C180009B; Tue, 26 Apr 2022 09:50:36 -0300 (-03) From: "Lucas Mateus Castro(alqotel)" To: qemu-ppc@nongnu.org Subject: [RFC PATCH 2/7] target/ppc: Implemented xvi*ger* instructions Date: Tue, 26 Apr 2022 09:50:23 -0300 Message-Id: <20220426125028.18844-3-lucas.araujo@eldorado.org.br> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220426125028.18844-1-lucas.araujo@eldorado.org.br> References: <20220426125028.18844-1-lucas.araujo@eldorado.org.br> MIME-Version: 1.0 X-OriginalArrivalTime: 26 Apr 2022 12:50:37.0240 (UTC) FILETIME=[3A086B80:01D8596C] X-Host-Lookup-Failed: Reverse DNS lookup failed for 187.72.171.209 (failed) Received-SPF: pass client-ip=187.72.171.209; envelope-from=lucas.araujo@eldorado.org.br; helo=outlook.eldorado.org.br X-Spam_score_int: -4 X-Spam_score: -0.5 X-Spam_bar: / X-Spam_report: (-0.5 / 5.0 requ) BAYES_00=-1.9, PDS_HP_HELO_NORDNS=0.659, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Daniel Henrique Barboza , richard.henderson@linaro.org, Greg Kurz , "open list:All patches CC here" , "Lucas Mateus Castro \(alqotel\)" , =?utf-8?q?C=C3=A9dric_Le_Goater?= , David Gibson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: "Lucas Mateus Castro (alqotel)" Implement the following PowerISA v3.1 instructions: xvi4ger8: VSX Vector 4-bit Signed Integer GER (rank-8 update) xvi4ger8pp: VSX Vector 4-bit Signed Integer GER (rank-8 update) Positive multiply, Positive accumulate xvi8ger4: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) xvi8ger4pp: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) Positive multiply, Positive 
accumulate xvi8ger4spp: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with Saturate Positive multiply, Positive accumulate xvi16ger2: VSX Vector 16-bit Signed Integer GER (rank-2 update) xvi16ger2pp: VSX Vector 16-bit Signed Integer GER (rank-2 update) Positive multiply, Positive accumulate xvi16ger2s: VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation xvi16ger2spp: VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) --- target/ppc/cpu.h | 5 ++ target/ppc/helper.h | 3 + target/ppc/insn32.decode | 15 +++++ target/ppc/int_helper.c | 85 +++++++++++++++++++++++++++++ target/ppc/internal.h | 28 ++++++++++ target/ppc/translate/vsx-impl.c.inc | 50 +++++++++++++++++ 6 files changed, 186 insertions(+) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index c2b6c987c0..ee55c6cfa2 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -2688,6 +2688,11 @@ static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i) return (uint64_t *)((uintptr_t)env + vsr64_offset(i, false)); } +static inline ppc_vsr_t *cpu_vsr_ptr(CPUPPCState *env, int i) +{ + return (ppc_vsr_t *)((uintptr_t)env + vsr_full_offset(i)); +} + static inline long avr64_offset(int i, bool high) { return vsr64_offset(i + 32, high); diff --git a/target/ppc/helper.h b/target/ppc/helper.h index aa6773c4a5..06553517de 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -537,6 +537,9 @@ DEF_HELPER_5(XXBLENDVB, void, vsr, vsr, vsr, vsr, i32) DEF_HELPER_5(XXBLENDVH, void, vsr, vsr, vsr, vsr, i32) DEF_HELPER_5(XXBLENDVW, void, vsr, vsr, vsr, vsr, i32) DEF_HELPER_5(XXBLENDVD, void, vsr, vsr, vsr, vsr, i32) +DEF_HELPER_6(XVI4GER8, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(XVI8GER4, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(XVI16GER2, void, env, i32, i32, i32, i32, i32) DEF_HELPER_2(efscfsi, i32, env, i32) DEF_HELPER_2(efscfui, i32, env, i32) diff --git 
a/target/ppc/insn32.decode b/target/ppc/insn32.decode index 7a76bedfa6..653f50db93 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -170,6 +170,9 @@ &XX3 xt xa xb @XX3 ...... ..... ..... ..... ........ ... &XX3 xt=%xx_xt xa=%xx_xa xb=%xx_xb +%xx_at 23:3 !function=times_4 +@XX3_at ...... ... .. ..... ..... ........ ... &XX3 xt=%xx_at xb=%xx_xb + &XX3_dm xt xa xb dm @XX3_dm ...... ..... ..... ..... . dm:2 ..... ... &XX3_dm xt=%xx_xt xa=%xx_xa xb=%xx_xb @@ -719,3 +722,15 @@ RFEBB 010011-------------- . 0010010010 - @XL_s XXMFACC 011111 ... -- 00000 ----- 0010110001 - @X_a XXMTACC 011111 ... -- 00001 ----- 0010110001 - @X_a XXSETACCZ 011111 ... -- 00011 ----- 0010110001 - @X_a + +## Vector GER instruction + +XVI4GER8 111011 ... -- ..... ..... 00100011 ..- @XX3_at xa=%xx_xa +XVI4GER8PP 111011 ... -- ..... ..... 00100010 ..- @XX3_at xa=%xx_xa +XVI8GER4 111011 ... -- ..... ..... 00000011 ..- @XX3_at xa=%xx_xa +XVI8GER4PP 111011 ... -- ..... ..... 00000010 ..- @XX3_at xa=%xx_xa +XVI16GER2 111011 ... -- ..... ..... 01001011 ..- @XX3_at xa=%xx_xa +XVI16GER2PP 111011 ... -- ..... ..... 01101011 ..- @XX3_at xa=%xx_xa +XVI8GER4SPP 111011 ... -- ..... ..... 01100011 ..- @XX3_at xa=%xx_xa +XVI16GER2S 111011 ... -- ..... ..... 00101011 ..- @XX3_at xa=%xx_xa +XVI16GER2SPP 111011 ... -- ..... ..... 
00101010 ..- @XX3_at xa=%xx_xa diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c index 8c1674510b..bd2f1a7c2a 100644 --- a/target/ppc/int_helper.c +++ b/target/ppc/int_helper.c @@ -782,6 +782,91 @@ VCT(uxs, cvtsduw, u32) VCT(sxs, cvtsdsw, s32) #undef VCT +/* + * Packed VSX Integer GER Flags + * 00 - no accumulation no saturation + * 01 - accumulate but no saturation + * 10 - no accumulation but with saturation + * 11 - accumulate with saturation + */ +static inline bool get_sat(uint32_t flags) +{ + return flags & 0x2; +} + +static inline bool get_acc(uint32_t flags) +{ + return flags & 0x1; +} + +#define GET_VsrN(a, i) (extract32(a->VsrB((i) / 2), (i) % 2 ? 4 : 0, 4)) +#define GET_VsrB(a, i) a->VsrB(i) +#define GET_VsrH(a, i) a->VsrH(i) + +#define GET_VsrSN(a, i) (sextract32(a->VsrSB((i) / 2), (i) % 2 ? 4 : 0, 4)) +#define GET_VsrSB(a, i) a->VsrSB(i) +#define GET_VsrSH(a, i) a->VsrSH(i) + +#define XVIGER(NAME, RANK, EL) \ + void NAME(CPUPPCState *env, uint32_t a_r, uint32_t b_r, \ + uint32_t at_r, uint32_t mask, uint32_t packed_flags) \ + { \ + ppc_vsr_t *a = cpu_vsr_ptr(env, a_r), *b = cpu_vsr_ptr(env, b_r), *at; \ + bool sat = get_sat(packed_flags), acc = get_acc(packed_flags); \ + uint8_t pmsk = ger_get_pmsk(mask), xmsk = ger_get_xmsk(mask), \ + ymsk = ger_get_ymsk(mask); \ + uint8_t pmsk_bit, xmsk_bit, ymsk_bit; \ + int64_t psum; \ + int32_t va, vb; \ + int i, j, k; \ + for (i = 0, xmsk_bit = 1 << 3; i < 4; i++, xmsk_bit >>= 1) { \ + at = cpu_vsr_ptr(env, at_r + i); \ + for (j = 0, ymsk_bit = 1 << 3; j < 4; j++, ymsk_bit >>= 1) { \ + if ((xmsk_bit & xmsk) && (ymsk_bit & ymsk)) { \ + psum = 0; \ + for (k = 0, pmsk_bit = 1 << (RANK - 1); k < RANK; \ + k++, pmsk_bit >>= 1) { \ + if (pmsk_bit & pmsk) { \ + va = (int32_t)GET_VsrS##EL(a, RANK * i + k); \ + vb = (int32_t) ((RANK == 4) ? 
\ + GET_Vsr##EL(b, RANK * j + k) : \ + GET_VsrS##EL(b, RANK * j + k));\ + psum += va * vb; \ + } \ + } \ + if (acc) { \ + psum += at->VsrSW(j); \ + } \ + if (sat && psum > INT32_MAX) { \ + set_vscr_sat(env); \ + at->VsrSW(j) = INT32_MAX; \ + } else if (sat && psum < INT32_MIN) { \ + set_vscr_sat(env); \ + at->VsrSW(j) = INT32_MIN; \ + } else { \ + at->VsrSW(j) = (int32_t) psum; \ + } \ + } else { \ + at->VsrSW(j) = 0; \ + } \ + } \ + } \ + } + +XVIGER(helper_XVI4GER8, 8, N) +XVIGER(helper_XVI8GER4, 4, B) +XVIGER(helper_XVI16GER2, 2, H) + +#undef GER_MULT +#undef XVIGER_NAME +#undef XVIGER +#undef GET_VsrN +#undef GET_VsrB +#undef GET_VsrH +#undef GET_VsrSN +#undef GET_VsrSB +#undef GET_VsrSH + target_ulong helper_vclzlsbb(ppc_avr_t *r) { target_ulong count = 0; diff --git a/target/ppc/internal.h b/target/ppc/internal.h index 8094e0b033..a994d98238 100644 --- a/target/ppc/internal.h +++ b/target/ppc/internal.h @@ -291,4 +291,32 @@ G_NORETURN void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr addr, uintptr_t retaddr); #endif +/* + * Auxiliary functions to pack/unpack masks for GER instructions. 
+ * + * Packed format: + * Bits 0-3: xmsk + * Bits 4-7: ymsk + * Bits 8-15: pmsk + */ +static inline uint8_t ger_get_xmsk(uint32_t packed_masks) +{ + return packed_masks & 0xF; +} + +static inline uint8_t ger_get_ymsk(uint32_t packed_masks) +{ + return (packed_masks >> 4) & 0xF; +} + +static inline uint8_t ger_get_pmsk(uint32_t packed_masks) +{ + return (packed_masks >> 8) & 0xFF; +} + +static inline int ger_pack_masks(int pmsk, int ymsk, int xmsk) +{ + return (pmsk & 0xFF) << 8 | (ymsk & 0xF) << 4 | (xmsk & 0xF); +} + #endif /* PPC_INTERNAL_H */ diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index 919b889c40..1eb68c7081 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -2823,6 +2823,56 @@ static bool trans_XXSETACCZ(DisasContext *ctx, arg_X_a *a) return true; } +/* + * Packed VSX Integer GER Flags + * 00 - no accumulation no saturation + * 01 - accumulate but no saturation + * 10 - no accumulation but with saturation + * 11 - accumulate with saturation + */ +static uint32_t pack_flags_xvi(int acc, int sat) +{ + return (sat << 1) | acc; +} + +static bool do_ger_XX3(DisasContext *ctx, arg_XX3 *a, uint32_t op, + void (*helper)(TCGv_env, TCGv_i32, TCGv_i32, + TCGv_i32, TCGv_i32, TCGv_i32)) +{ + uint32_t mask; + REQUIRE_INSNS_FLAGS2(ctx, ISA310); + REQUIRE_VSX(ctx); + if (unlikely((a->xa / 4 == a->xt / 4) || (a->xb / 4 == a->xt / 4))) { + gen_invalid(ctx); + return true; + } + + mask = 0xFFFFFFFF; + helper(cpu_env, tcg_constant_i32(a->xa), tcg_constant_i32(a->xb), + tcg_constant_i32(a->xt), tcg_constant_i32(mask), + tcg_constant_i32(op)); + return true; +} + +/* Used to keep line length < 80 */ +#define GER_NOP pack_flags_xvi(0, 0) +#define GER_PP pack_flags_xvi(1, 0) +#define GER_SAT pack_flags_xvi(0, 1) +#define GER_SPP pack_flags_xvi(1, 1) +TRANS(XVI4GER8, do_ger_XX3, GER_NOP, gen_helper_XVI4GER8) +TRANS(XVI4GER8PP, do_ger_XX3, GER_PP, gen_helper_XVI4GER8) +TRANS(XVI8GER4, 
do_ger_XX3, GER_NOP, gen_helper_XVI8GER4) +TRANS(XVI8GER4PP, do_ger_XX3, GER_PP, gen_helper_XVI8GER4) +TRANS(XVI8GER4SPP, do_ger_XX3, GER_SPP, gen_helper_XVI8GER4) +TRANS(XVI16GER2, do_ger_XX3, GER_NOP, gen_helper_XVI16GER2) +TRANS(XVI16GER2PP, do_ger_XX3, GER_PP, gen_helper_XVI16GER2) +TRANS(XVI16GER2S, do_ger_XX3, GER_SAT, gen_helper_XVI16GER2) +TRANS(XVI16GER2SPP, do_ger_XX3, GER_SPP, gen_helper_XVI16GER2) +#undef GER_NOP +#undef GER_PP +#undef GER_SAT +#undef GER_SPP + #undef GEN_XX2FORM #undef GEN_XX3FORM #undef GEN_XX2IFORM From patchwork Tue Apr 26 12:50:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lucas Mateus Martins Araujo e Castro X-Patchwork-Id: 12827053 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CEF7FC433F5 for ; Tue, 26 Apr 2022 12:56:32 +0000 (UTC) Received: from localhost ([::1]:43302 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njKk3-0006SZ-Pb for qemu-devel@archiver.kernel.org; Tue, 26 Apr 2022 08:56:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45390) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njKeb-0004Hm-0T; Tue, 26 Apr 2022 08:50:53 -0400 Received: from [187.72.171.209] (port=32914 helo=outlook.eldorado.org.br) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njKeZ-00005X-9y; Tue, 26 Apr 2022 08:50:52 -0400 Received: from p9ibm ([10.10.71.235]) by outlook.eldorado.org.br over TLS secured channel with Microsoft SMTPSVC(8.5.9600.16384); Tue, 26 Apr 2022 09:50:37 -0300 Received: from eldorado.org.br (unknown [10.10.70.45]) by p9ibm (Postfix) with ESMTP id 
2215C801008; Tue, 26 Apr 2022 09:50:37 -0300 (-03) From: "Lucas Mateus Castro(alqotel)" To: qemu-ppc@nongnu.org Subject: [RFC PATCH 3/7] target/ppc: Implemented pmxvi*ger* instructions Date: Tue, 26 Apr 2022 09:50:24 -0300 Message-Id: <20220426125028.18844-4-lucas.araujo@eldorado.org.br> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220426125028.18844-1-lucas.araujo@eldorado.org.br> References: <20220426125028.18844-1-lucas.araujo@eldorado.org.br> MIME-Version: 1.0 X-OriginalArrivalTime: 26 Apr 2022 12:50:37.0364 (UTC) FILETIME=[3A1B5740:01D8596C] X-Host-Lookup-Failed: Reverse DNS lookup failed for 187.72.171.209 (failed) Received-SPF: pass client-ip=187.72.171.209; envelope-from=lucas.araujo@eldorado.org.br; helo=outlook.eldorado.org.br X-Spam_score_int: -4 X-Spam_score: -0.5 X-Spam_bar: / X-Spam_report: (-0.5 / 5.0 requ) BAYES_00=-1.9, PDS_HP_HELO_NORDNS=0.659, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Daniel Henrique Barboza , richard.henderson@linaro.org, Greg Kurz , "open list:All patches CC here" , "Lucas Mateus Castro \(alqotel\)" , =?utf-8?q?C=C3=A9dric_Le_Goater?= , David Gibson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: "Lucas Mateus Castro (alqotel)" Implement the following PowerISA v3.1 instructions: pmxvi4ger8: Prefixed Masked VSX Vector 4-bit Signed Integer GER (rank-8 update) pmxvi4ger8pp: Prefixed Masked VSX Vector 4-bit Signed Integer GER (rank-8 update) Positive multiply, Positive accumulate pmxvi8ger4: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) pmxvi8ger4pp: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) Positive multiply, Positive accumulate pmxvi8ger4spp: Prefixed 
Masked VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with Saturate Positive multiply, Positive accumulate pmxvi16ger2: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) pmxvi16ger2pp: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) Positive multiply, Positive accumulate pmxvi16ger2s: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation pmxvi16ger2spp: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) --- target/ppc/insn64.decode | 30 +++++++++++++++++++++++++++++ target/ppc/translate/vsx-impl.c.inc | 28 +++++++++++++++++++++++++-- 2 files changed, 56 insertions(+), 2 deletions(-) diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode index 691e8fe6c0..18915f1977 100644 --- a/target/ppc/insn64.decode +++ b/target/ppc/insn64.decode @@ -68,6 +68,15 @@ ...... ..... ..... ..... ..... .. .... \ &8RR_XX4_uim3 xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb xc=%8rr_xx_xc +# Format MMIRR:XX3 +&MMIRR_XX3 xa xb xt pmsk xmsk ymsk +%xx3_xa 2:1 16:5 +%xx3_xb 1:1 11:5 +%xx3_at 23:3 !function=times_4 +@MMIRR_XX3 ...... .. .... .. . . ........ xmsk:4 ymsk:4 \ + ...... ... .. ..... ..... ........ ... \ + &MMIRR_XX3 xa=%xx3_xa xb=%xx3_xb xt=%xx3_at + ### Fixed-Point Load Instructions PLBZ 000001 10 0--.-- .................. \ @@ -115,6 +124,27 @@ PSTFS 000001 10 0--.-- .................. \ PSTFD 000001 10 0--.-- .................. \ 110110 ..... ..... ................ @PLS_D +## Vector GER instruction + +PMXVI4GER8 000001 11 1001 -- - - pmsk:8 ........ \ + 111011 ... -- ..... ..... 00100011 ..- @MMIRR_XX3 +PMXVI4GER8PP 000001 11 1001 -- - - pmsk:8 ........ \ + 111011 ... -- ..... ..... 00100010 ..- @MMIRR_XX3 +PMXVI8GER4 000001 11 1001 -- - - pmsk:4 ---- ........ \ + 111011 ... -- ..... ..... 00000011 ..- @MMIRR_XX3 +PMXVI8GER4PP 000001 11 1001 -- - - pmsk:4 ---- ........ 
\ + 111011 ... -- ..... ..... 00000010 ..- @MMIRR_XX3 +PMXVI16GER2 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 01001011 ..- @MMIRR_XX3 +PMXVI16GER2PP 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 01101011 ..- @MMIRR_XX3 +PMXVI8GER4SPP 000001 11 1001 -- - - pmsk:4 ---- ........ \ + 111011 ... -- ..... ..... 01100011 ..- @MMIRR_XX3 +PMXVI16GER2S 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 00101011 ..- @MMIRR_XX3 +PMXVI16GER2SPP 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 00101010 ..- @MMIRR_XX3 + ### Prefixed No-operation Instruction @PNOP 000001 11 0000-- 000000000000000000 \ diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index 1eb68c7081..eb7b8cb0c6 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -2835,7 +2835,7 @@ static uint32_t pack_flags_xvi(int acc, int sat) return (sat << 1) | acc; } -static bool do_ger_XX3(DisasContext *ctx, arg_XX3 *a, uint32_t op, +static bool do_ger_MMIRR_XX3(DisasContext *ctx, arg_MMIRR_XX3 *a, uint32_t op, void (*helper)(TCGv_env, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32)) { @@ -2847,11 +2847,25 @@ static bool do_ger_XX3(DisasContext *ctx, arg_XX3 *a, uint32_t op, return true; } - mask = 0xFFFFFFFF; + mask = ger_pack_masks(a->pmsk, a->ymsk, a->xmsk); helper(cpu_env, tcg_constant_i32(a->xa), tcg_constant_i32(a->xb), tcg_constant_i32(a->xt), tcg_constant_i32(mask), tcg_constant_i32(op)); return true; + +} +static bool do_ger_XX3(DisasContext *ctx, arg_XX3 *a, uint32_t op_flags, + void (*helper)(TCGv_env, TCGv_i32, TCGv_i32, + TCGv_i32, TCGv_i32, TCGv_i32)) +{ + arg_MMIRR_XX3 m; + m.xa = a->xa; + m.xb = a->xb; + m.xt = a->xt; + m.pmsk = 0xFF; + m.ymsk = 0xF; + m.xmsk = 0xF; + return do_ger_MMIRR_XX3(ctx, &m, op_flags, helper); } /* Used to keep line length < 80 */ @@ -2868,6 +2882,16 @@ TRANS(XVI16GER2, do_ger_XX3, GER_NOP, 
gen_helper_XVI16GER2) TRANS(XVI16GER2PP, do_ger_XX3, GER_PP, gen_helper_XVI16GER2) TRANS(XVI16GER2S, do_ger_XX3, GER_SAT, gen_helper_XVI16GER2) TRANS(XVI16GER2SPP, do_ger_XX3, GER_SPP, gen_helper_XVI16GER2) + +TRANS64(PMXVI4GER8, do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVI4GER8) +TRANS64(PMXVI4GER8PP, do_ger_MMIRR_XX3, GER_PP, gen_helper_XVI4GER8) +TRANS64(PMXVI8GER4, do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVI8GER4) +TRANS64(PMXVI8GER4PP, do_ger_MMIRR_XX3, GER_PP, gen_helper_XVI8GER4) +TRANS64(PMXVI8GER4SPP, do_ger_MMIRR_XX3, GER_SPP, gen_helper_XVI8GER4) +TRANS64(PMXVI16GER2, do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVI16GER2) +TRANS64(PMXVI16GER2PP, do_ger_MMIRR_XX3, GER_PP, gen_helper_XVI16GER2) +TRANS64(PMXVI16GER2S, do_ger_MMIRR_XX3, GER_SAT, gen_helper_XVI16GER2) +TRANS64(PMXVI16GER2SPP, do_ger_MMIRR_XX3, GER_SPP, gen_helper_XVI16GER2) #undef GER_NOP #undef GER_PP #undef GER_SAT From patchwork Tue Apr 26 12:50:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lucas Mateus Martins Araujo e Castro X-Patchwork-Id: 12827108 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A4490C433F5 for ; Tue, 26 Apr 2022 13:05:05 +0000 (UTC) Received: from localhost ([::1]:59160 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njKsK-0000er-O4 for qemu-devel@archiver.kernel.org; Tue, 26 Apr 2022 09:05:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45410) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njKee-0004QU-O3; Tue, 26 Apr 2022 08:50:57 -0400 Received: from [187.72.171.209] (port=32914 helo=outlook.eldorado.org.br) by 
eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njKeb-00005X-Vn; Tue, 26 Apr 2022 08:50:55 -0400 Received: from p9ibm ([10.10.71.235]) by outlook.eldorado.org.br over TLS secured channel with Microsoft SMTPSVC(8.5.9600.16384); Tue, 26 Apr 2022 09:50:37 -0300 Received: from eldorado.org.br (unknown [10.10.70.45]) by p9ibm (Postfix) with ESMTP id 5335580009B; Tue, 26 Apr 2022 09:50:37 -0300 (-03) From: "Lucas Mateus Castro(alqotel)" To: qemu-ppc@nongnu.org Subject: [RFC PATCH 4/7] target/ppc: Implemented xvf*ger* Date: Tue, 26 Apr 2022 09:50:25 -0300 Message-Id: <20220426125028.18844-5-lucas.araujo@eldorado.org.br> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220426125028.18844-1-lucas.araujo@eldorado.org.br> References: <20220426125028.18844-1-lucas.araujo@eldorado.org.br> MIME-Version: 1.0 X-OriginalArrivalTime: 26 Apr 2022 12:50:37.0584 (UTC) FILETIME=[3A3CE900:01D8596C] X-Host-Lookup-Failed: Reverse DNS lookup failed for 187.72.171.209 (failed) Received-SPF: pass client-ip=187.72.171.209; envelope-from=lucas.araujo@eldorado.org.br; helo=outlook.eldorado.org.br X-Spam_score_int: -4 X-Spam_score: -0.5 X-Spam_bar: / X-Spam_report: (-0.5 / 5.0 requ) BAYES_00=-1.9, PDS_HP_HELO_NORDNS=0.659, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Daniel Henrique Barboza , richard.henderson@linaro.org, Greg Kurz , "open list:All patches CC here" , "Lucas Mateus Castro \(alqotel\)" , =?utf-8?q?C=C3=A9dric_Le_Goater?= , David Gibson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: "Lucas Mateus Castro (alqotel)" Implement the following PowerISA v3.1 instructions: xvf32ger: VSX Vector 32-bit Floating-Point GER (rank-1 update) xvf32gernn: VSX Vector 32-bit 
Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate xvf32gernp: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate xvf32gerpn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate xvf32gerpp: VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate xvf64ger: VSX Vector 64-bit Floating-Point GER (rank-1 update) xvf64gernn: VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate xvf64gernp: VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate xvf64gerpn: VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate xvf64gerpp: VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) --- target/ppc/cpu.h | 4 ++ target/ppc/fpu_helper.c | 64 +++++++++++++++++++++++++++++ target/ppc/helper.h | 2 + target/ppc/insn32.decode | 13 ++++++ target/ppc/translate/vsx-impl.c.inc | 39 ++++++++++++++++++ 5 files changed, 122 insertions(+) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index ee55c6cfa2..b5d7b35dda 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -2652,6 +2652,8 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx) #define VsrSW(i) s32[i] #define VsrD(i) u64[i] #define VsrSD(i) s64[i] +#define VsrSF(i) f32[i] +#define VsrDF(i) f64[i] #else #define VsrB(i) u8[15 - (i)] #define VsrSB(i) s8[15 - (i)] @@ -2661,6 +2663,8 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx) #define VsrSW(i) s32[3 - (i)] #define VsrD(i) u64[1 - (i)] #define VsrSD(i) s64[1 - (i)] +#define VsrSF(i) f32[3 - (i)] +#define VsrDF(i) f64[1 - (i)] #endif static inline int vsr64_offset(int i, bool high) diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index 99281cc37a..6b03666d09 100644 --- a/target/ppc/fpu_helper.c +++ 
b/target/ppc/fpu_helper.c @@ -3462,3 +3462,67 @@ void helper_xssubqp(CPUPPCState *env, uint32_t opcode, *xt = t; do_float_check_status(env, GETPC()); } + +static inline bool ger_acc_flag(uint32_t flag) +{ + return flag & 0x1; +} + +static inline bool ger_neg_mul_flag(uint32_t flag) +{ + return flag & 0x2; +} + +static inline bool ger_neg_acc_flag(uint32_t flag) +{ + return flag & 0x4; +} + +#define VSXGER(NAME, TYPE, EL) \ + void NAME(CPUPPCState *env, uint32_t a_r, uint32_t b_r, \ + uint32_t at_r, uint32_t mask, uint32_t packed_flags) \ + { \ + ppc_vsr_t *a, *b, *at; \ + TYPE aux_acc, va, vb; \ + int i, j, xmsk_bit, ymsk_bit, op_flags; \ + uint8_t xmsk = mask & 0x0F; \ + uint8_t ymsk = (mask >> 4) & 0x0F; \ + int ymax = MIN(4, 128 / (sizeof(TYPE) * 8)); \ + b = cpu_vsr_ptr(env, b_r); \ + float_status *excp_ptr = &env->fp_status; \ + bool acc = ger_acc_flag(packed_flags); \ + bool neg_acc = ger_neg_acc_flag(packed_flags); \ + bool neg_mul = ger_neg_mul_flag(packed_flags); \ + helper_reset_fpstatus(env); \ + for (i = 0, xmsk_bit = 1 << 3; i < 4; i++, xmsk_bit >>= 1) { \ + a = cpu_vsr_ptr(env, a_r + i / ymax); \ + at = cpu_vsr_ptr(env, at_r + i); \ + for (j = 0, ymsk_bit = 1 << (ymax - 1); j < ymax; \ + j++, ymsk_bit >>= 1) { \ + if ((xmsk_bit & xmsk) && (ymsk_bit & ymsk)) { \ + op_flags = (neg_acc ^ neg_mul) ? \ + float_muladd_negate_c : 0; \ + op_flags |= (neg_mul) ? 
\ + float_muladd_negate_result : 0; \ + va = a->Vsr##EL(i % ymax); \ + vb = b->Vsr##EL(j); \ + aux_acc = at->Vsr##EL(j); \ + if (acc) { \ + at->Vsr##EL(j) = TYPE##_muladd(va, vb, aux_acc, \ + op_flags, \ + excp_ptr); \ + } else { \ + at->Vsr##EL(j) = TYPE##_mul(va, vb, excp_ptr); \ + } \ + } else { \ + at->Vsr##EL(j) = 0; \ + } \ + } \ + } \ + do_float_check_status(env, GETPC()); \ + } + +VSXGER(helper_XVF32GER, float32, SF) +VSXGER(helper_XVF64GER, float64, DF) + +#undef VSXGER diff --git a/target/ppc/helper.h b/target/ppc/helper.h index 06553517de..7d725292b1 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -540,6 +540,8 @@ DEF_HELPER_5(XXBLENDVD, void, vsr, vsr, vsr, vsr, i32) DEF_HELPER_6(XVI4GER8, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(XVI8GER4, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(XVI16GER2, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(XVF32GER, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(XVF64GER, void, env, i32, i32, i32, i32, i32) DEF_HELPER_2(efscfsi, i32, env, i32) DEF_HELPER_2(efscfui, i32, env, i32) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index 653f50db93..9652ca286c 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -171,6 +171,7 @@ @XX3 ...... ..... ..... ..... ........ ... &XX3 xt=%xx_xt xa=%xx_xa xb=%xx_xb %xx_at 23:3 !function=times_4 +%xx_xa_pair 2:1 17:4 !function=times_2 @XX3_at ...... ... .. ..... ..... ........ ... &XX3 xt=%xx_at xb=%xx_xb &XX3_dm xt xa xb dm @@ -734,3 +735,15 @@ XVI16GER2PP 111011 ... -- ..... ..... 01101011 ..- @XX3_at xa=%xx_xa XVI8GER4SPP 111011 ... -- ..... ..... 01100011 ..- @XX3_at xa=%xx_xa XVI16GER2S 111011 ... -- ..... ..... 00101011 ..- @XX3_at xa=%xx_xa XVI16GER2SPP 111011 ... -- ..... ..... 00101010 ..- @XX3_at xa=%xx_xa + +XVF32GER 111011 ... -- ..... ..... 00011011 ..- @XX3_at xa=%xx_xa +XVF32GERPP 111011 ... -- ..... ..... 00011010 ..- @XX3_at xa=%xx_xa +XVF32GERPN 111011 ... -- ..... ..... 
10011010 ..- @XX3_at xa=%xx_xa +XVF32GERNP 111011 ... -- ..... ..... 01011010 ..- @XX3_at xa=%xx_xa +XVF32GERNN 111011 ... -- ..... ..... 11011010 ..- @XX3_at xa=%xx_xa + +XVF64GER 111011 ... -- .... 0 ..... 00111011 ..- @XX3_at xa=%xx_xa_pair +XVF64GERPP 111011 ... -- .... 0 ..... 00111010 ..- @XX3_at xa=%xx_xa_pair +XVF64GERPN 111011 ... -- .... 0 ..... 10111010 ..- @XX3_at xa=%xx_xa_pair +XVF64GERNP 111011 ... -- .... 0 ..... 01111010 ..- @XX3_at xa=%xx_xa_pair +XVF64GERNN 111011 ... -- .... 0 ..... 11111010 ..- @XX3_at xa=%xx_xa_pair diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index eb7b8cb0c6..b1fb0f31f3 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -2835,6 +2835,19 @@ static uint32_t pack_flags_xvi(int acc, int sat) return (sat << 1) | acc; } +/* + * Packed VSX Floating Point GER Flags + * 000 - no accumulation + * 001 - positive accumulate, positive multiply + * 011 - positive accumulate, negative multiply + * 101 - negative accumulate, positive multiply + * 111 - negative accumulate, negative multiply + */ +static inline uint32_t ger_pack_flags_xvf(bool acc, bool nm, bool na) +{ + return (acc ? 0x1 : 0) | (nm ? 0x2 : 0) | (na ?
0x4 : 0); +} + static bool do_ger_MMIRR_XX3(DisasContext *ctx, arg_MMIRR_XX3 *a, uint32_t op, void (*helper)(TCGv_env, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32)) @@ -2897,6 +2910,32 @@ TRANS64(PMXVI16GER2SPP, do_ger_MMIRR_XX3, GER_SPP, gen_helper_XVI16GER2) #undef GER_SAT #undef GER_SPP +/* To keep line size < 80 */ +#define GER_NOP ger_pack_flags_xvf(false, false, false) +#define GER_PP ger_pack_flags_xvf(true, false, false) +#define GER_NP ger_pack_flags_xvf(true, true, false) +#define GER_PN ger_pack_flags_xvf(true, false, true) +#define GER_NN ger_pack_flags_xvf(true, true, true) + +TRANS(XVF32GER, do_ger_XX3, GER_NOP, gen_helper_XVF32GER) +TRANS(XVF32GERPP, do_ger_XX3, GER_PP, gen_helper_XVF32GER) +TRANS(XVF32GERPN, do_ger_XX3, GER_PN, gen_helper_XVF32GER) +TRANS(XVF32GERNP, do_ger_XX3, GER_NP, gen_helper_XVF32GER) +TRANS(XVF32GERNN, do_ger_XX3, GER_NN, gen_helper_XVF32GER) + +TRANS(XVF64GER, do_ger_XX3, GER_NOP, gen_helper_XVF64GER) +TRANS(XVF64GERPP, do_ger_XX3, GER_PP, gen_helper_XVF64GER) +TRANS(XVF64GERPN, do_ger_XX3, GER_PN, gen_helper_XVF64GER) +TRANS(XVF64GERNP, do_ger_XX3, GER_NP, gen_helper_XVF64GER) +TRANS(XVF64GERNN, do_ger_XX3, GER_NN, gen_helper_XVF64GER) + + +#undef GER_NOP +#undef GER_PP +#undef GER_NP +#undef GER_PN +#undef GER_NN + #undef GEN_XX2FORM #undef GEN_XX3FORM #undef GEN_XX2IFORM From patchwork Tue Apr 26 12:50:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lucas Mateus Martins Araujo e Castro X-Patchwork-Id: 12827054 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 39109C433EF for ; Tue, 26 Apr 2022 12:57:24 +0000 (UTC) Received: from localhost ([::1]:46598 
helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njKkt-0000Ct-5j for qemu-devel@archiver.kernel.org; Tue, 26 Apr 2022 08:57:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45426) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njKeh-0004Ya-Hy; Tue, 26 Apr 2022 08:50:59 -0400 Received: from [187.72.171.209] (port=32914 helo=outlook.eldorado.org.br) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njKef-00005X-Nr; Tue, 26 Apr 2022 08:50:59 -0400 Received: from p9ibm ([10.10.71.235]) by outlook.eldorado.org.br over TLS secured channel with Microsoft SMTPSVC(8.5.9600.16384); Tue, 26 Apr 2022 09:50:37 -0300 Received: from eldorado.org.br (unknown [10.10.70.45]) by p9ibm (Postfix) with ESMTP id 803A8801008; Tue, 26 Apr 2022 09:50:37 -0300 (-03) From: "Lucas Mateus Castro(alqotel)" To: qemu-ppc@nongnu.org Subject: [RFC PATCH 5/7] target/ppc: Implemented xvf16ger* Date: Tue, 26 Apr 2022 09:50:26 -0300 Message-Id: <20220426125028.18844-6-lucas.araujo@eldorado.org.br> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220426125028.18844-1-lucas.araujo@eldorado.org.br> References: <20220426125028.18844-1-lucas.araujo@eldorado.org.br> MIME-Version: 1.0 X-OriginalArrivalTime: 26 Apr 2022 12:50:37.0787 (UTC) FILETIME=[3A5BE2B0:01D8596C] X-Host-Lookup-Failed: Reverse DNS lookup failed for 187.72.171.209 (failed) Received-SPF: pass client-ip=187.72.171.209; envelope-from=lucas.araujo@eldorado.org.br; helo=outlook.eldorado.org.br X-Spam_score_int: -4 X-Spam_score: -0.5 X-Spam_bar: / X-Spam_report: (-0.5 / 5.0 requ) BAYES_00=-1.9, PDS_HP_HELO_NORDNS=0.659, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , 
Daniel Henrique Barboza , richard.henderson@linaro.org, Greg Kurz , "open list:All patches CC here" , "Lucas Mateus Castro \(alqotel\)" , =?utf-8?q?C=C3=A9dric_Le_Goater?= , =?utf-8?q?Alex_Benn=C3=A9?= =?utf-8?q?e?= , Aurelien Jarno , David Gibson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: "Lucas Mateus Castro (alqotel)" Implement the following PowerISA v3.1 instructions: xvf16ger2: VSX Vector 16-bit Floating-Point GER (rank-2 update) xvf16ger2nn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Negative accumulate xvf16ger2np: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Positive accumulate xvf16ger2pn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Negative accumulate xvf16ger2pp: VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) --- include/fpu/softfloat.h | 9 ++++ target/ppc/cpu.h | 3 ++ target/ppc/fpu_helper.c | 65 +++++++++++++++++++++++++++++ target/ppc/helper.h | 1 + target/ppc/insn32.decode | 6 +++ target/ppc/translate/vsx-impl.c.inc | 6 +++ 6 files changed, 90 insertions(+) diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h index 3dcf20e3a2..63d7ff18f0 100644 --- a/include/fpu/softfloat.h +++ b/include/fpu/softfloat.h @@ -619,6 +619,15 @@ static inline float32 float32_chs(float32 a) return make_float32(float32_val(a) ^ 0x80000000); } +static inline float32 float32_neg(float32 a) +{ + if (((a & 0x7f800000) == 0x7f800000) && (a & 0x007fffff)) { + return a; + } else { + return float32_chs(a); + } +} + static inline bool float32_is_infinity(float32 a) { return (float32_val(a) & 0x7fffffff) == 0x7f800000; diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index b5d7b35dda..91167f8cc0 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -225,6 +225,7 @@ typedef union _ppc_vsr_t { int16_t s16[8]; int32_t s32[4]; int64_t s64[2]; + 
float16 f16[8]; float32 f32[4]; float64 f64[2]; float128 f128; @@ -2652,6 +2653,7 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx) #define VsrSW(i) s32[i] #define VsrD(i) u64[i] #define VsrSD(i) s64[i] +#define VsrHF(i) f16[i] #define VsrSF(i) f32[i] #define VsrDF(i) f64[i] #else @@ -2663,6 +2665,7 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx) #define VsrSW(i) s32[3 - (i)] #define VsrD(i) u64[1 - (i)] #define VsrSD(i) s64[1 - (i)] +#define VsrHF(i) f16[7 - (i)] #define VsrSF(i) f32[3 - (i)] #define VsrDF(i) f64[1 - (i)] #endif diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index 6b03666d09..c3aead642a 100644 --- a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -3478,6 +3478,67 @@ static inline bool ger_neg_acc_flag(uint32_t flag) return flag & 0x4; } +#define float16_to_float32(A, PTR) float16_to_float32(A, true, PTR) + +#define GET_VSR(VSR, A, I, SRC_T, TARGET_T) \ + SRC_T##_to_##TARGET_T(A->VSR(I), excp_ptr) + +#define VSXGER16(NAME, ORIG_T, OR_EL) \ + void NAME(CPUPPCState *env, uint32_t a_r, uint32_t b_r, \ + uint32_t at_r, uint32_t mask, uint32_t packed_flags) \ + { \ + ppc_vsr_t *at; \ + float32 psum, aux_acc, va, vb, vc, vd; \ + int i, j, xmsk_bit, ymsk_bit; \ + uint8_t xmsk = mask & 0x0F; \ + uint8_t ymsk = (mask >> 4) & 0x0F; \ + uint8_t pmsk = (mask >> 8) & 0x3; \ + ppc_vsr_t *b = cpu_vsr_ptr(env, b_r); \ + ppc_vsr_t *a = cpu_vsr_ptr(env, a_r); \ + float_status *excp_ptr = &env->fp_status; \ + bool acc = ger_acc_flag(packed_flags); \ + bool neg_acc = ger_neg_acc_flag(packed_flags); \ + bool neg_mul = ger_neg_mul_flag(packed_flags); \ + for (i = 0, xmsk_bit = 1 << 3; i < 4; i++, xmsk_bit >>= 1) { \ + at = cpu_vsr_ptr(env, at_r + i); \ + for (j = 0, ymsk_bit = 1 << 3; j < 4; j++, ymsk_bit >>= 1) {\ + if ((xmsk_bit & xmsk) && (ymsk_bit & ymsk)) { \ + va = !(pmsk & 2) ? float32_zero : \ + GET_VSR(Vsr##OR_EL, a, \ + 2 * i, ORIG_T, float32); \ + vb = !(pmsk & 2) ? 
float32_zero : \ + GET_VSR(Vsr##OR_EL, b, \ + 2 * j, ORIG_T, float32); \ + vc = !(pmsk & 1) ? float32_zero : \ + GET_VSR(Vsr##OR_EL, a, \ + 2 * i + 1, ORIG_T, float32);\ + vd = !(pmsk & 1) ? float32_zero : \ + GET_VSR(Vsr##OR_EL, b, \ + 2 * j + 1, ORIG_T, float32);\ + psum = float32_mul(va, vb, excp_ptr); \ + psum = float32_muladd(vc, vd, psum, 0, excp_ptr); \ + if (acc) { \ + if (neg_mul) { \ + psum = float32_neg(psum); \ + } \ + if (neg_acc) { \ + aux_acc = float32_neg(at->VsrSF(j)); \ + } else { \ + aux_acc = at->VsrSF(j); \ + } \ + at->VsrSF(j) = float32_add(psum, aux_acc, \ + excp_ptr); \ + } else { \ + at->VsrSF(j) = psum; \ + } \ + } else { \ + at->VsrSF(j) = 0; \ + } \ + } \ + } \ + do_float_check_status(env, GETPC()); \ + } + #define VSXGER(NAME, TYPE, EL) \ void NAME(CPUPPCState *env, uint32_t a_r, uint32_t b_r, \ uint32_t at_r, uint32_t mask, uint32_t packed_flags) \ @@ -3522,7 +3583,11 @@ static inline bool ger_neg_acc_flag(uint32_t flag) do_float_check_status(env, GETPC()); \ } +VSXGER16(helper_XVF16GER2, float16, HF) VSXGER(helper_XVF32GER, float32, SF) VSXGER(helper_XVF64GER, float64, DF) +#undef VSXGER16 #undef VSXGER +#undef GET_VSR +#undef float16_to_float32 diff --git a/target/ppc/helper.h b/target/ppc/helper.h index 7d725292b1..cc59a3b71d 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -540,6 +540,7 @@ DEF_HELPER_5(XXBLENDVD, void, vsr, vsr, vsr, vsr, i32) DEF_HELPER_6(XVI4GER8, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(XVI8GER4, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(XVI16GER2, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(XVF16GER2, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(XVF32GER, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(XVF64GER, void, env, i32, i32, i32, i32, i32) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index 9652ca286c..a204730d1d 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -736,6 +736,12 @@ XVI8GER4SPP 111011 ... -- ..... ..... 
01100011 ..- @XX3_at xa=%xx_xa XVI16GER2S 111011 ... -- ..... ..... 00101011 ..- @XX3_at xa=%xx_xa XVI16GER2SPP 111011 ... -- ..... ..... 00101010 ..- @XX3_at xa=%xx_xa +XVF16GER2 111011 ... -- ..... ..... 00010011 ..- @XX3_at xa=%xx_xa +XVF16GER2PP 111011 ... -- ..... ..... 00010010 ..- @XX3_at xa=%xx_xa +XVF16GER2PN 111011 ... -- ..... ..... 10010010 ..- @XX3_at xa=%xx_xa +XVF16GER2NP 111011 ... -- ..... ..... 01010010 ..- @XX3_at xa=%xx_xa +XVF16GER2NN 111011 ... -- ..... ..... 11010010 ..- @XX3_at xa=%xx_xa + XVF32GER 111011 ... -- ..... ..... 00011011 ..- @XX3_at xa=%xx_xa XVF32GERPP 111011 ... -- ..... ..... 00011010 ..- @XX3_at xa=%xx_xa XVF32GERPN 111011 ... -- ..... ..... 10011010 ..- @XX3_at xa=%xx_xa diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index b1fb0f31f3..9285e27159 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -2917,6 +2917,12 @@ TRANS64(PMXVI16GER2SPP, do_ger_MMIRR_XX3, GER_SPP, gen_helper_XVI16GER2) #define GER_PN ger_pack_flags_xvf(true, false, true) #define GER_NN ger_pack_flags_xvf(true, true, true) +TRANS(XVF16GER2, do_ger_XX3, GER_NOP, gen_helper_XVF16GER2) +TRANS(XVF16GER2PP, do_ger_XX3, GER_PP, gen_helper_XVF16GER2) +TRANS(XVF16GER2PN, do_ger_XX3, GER_PN, gen_helper_XVF16GER2) +TRANS(XVF16GER2NP, do_ger_XX3, GER_NP, gen_helper_XVF16GER2) +TRANS(XVF16GER2NN, do_ger_XX3, GER_NN, gen_helper_XVF16GER2) + TRANS(XVF32GER, do_ger_XX3, GER_NOP, gen_helper_XVF32GER) TRANS(XVF32GERPP, do_ger_XX3, GER_PP, gen_helper_XVF32GER) TRANS(XVF32GERPN, do_ger_XX3, GER_PN, gen_helper_XVF32GER) From patchwork Tue Apr 26 12:50:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lucas Mateus Martins Araujo e Castro X-Patchwork-Id: 12827073 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org 
[209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3F210C433F5 for ; Tue, 26 Apr 2022 13:02:50 +0000 (UTC) Received: from localhost ([::1]:54180 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njKq9-0005ZF-15 for qemu-devel@archiver.kernel.org; Tue, 26 Apr 2022 09:02:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45446) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njKek-0004gZ-Bh; Tue, 26 Apr 2022 08:51:02 -0400 Received: from [187.72.171.209] (port=32914 helo=outlook.eldorado.org.br) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njKei-00005X-IB; Tue, 26 Apr 2022 08:51:02 -0400 Received: from p9ibm ([10.10.71.235]) by outlook.eldorado.org.br over TLS secured channel with Microsoft SMTPSVC(8.5.9600.16384); Tue, 26 Apr 2022 09:50:38 -0300 Received: from eldorado.org.br (unknown [10.10.70.45]) by p9ibm (Postfix) with ESMTP id B421080009B; Tue, 26 Apr 2022 09:50:37 -0300 (-03) From: "Lucas Mateus Castro(alqotel)" To: qemu-ppc@nongnu.org Subject: [RFC PATCH 6/7] target/ppc: Implemented pmxvf*ger* Date: Tue, 26 Apr 2022 09:50:27 -0300 Message-Id: <20220426125028.18844-7-lucas.araujo@eldorado.org.br> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220426125028.18844-1-lucas.araujo@eldorado.org.br> References: <20220426125028.18844-1-lucas.araujo@eldorado.org.br> MIME-Version: 1.0 X-OriginalArrivalTime: 26 Apr 2022 12:50:38.0272 (UTC) FILETIME=[3AA5E400:01D8596C] X-Host-Lookup-Failed: Reverse DNS lookup failed for 187.72.171.209 (failed) Received-SPF: pass client-ip=187.72.171.209; envelope-from=lucas.araujo@eldorado.org.br; helo=outlook.eldorado.org.br X-Spam_score_int: -4 X-Spam_score: -0.5 X-Spam_bar: / X-Spam_report: (-0.5 / 5.0 requ) BAYES_00=-1.9, PDS_HP_HELO_NORDNS=0.659, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, 
SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Daniel Henrique Barboza , richard.henderson@linaro.org, Greg Kurz , "open list:All patches CC here" , "Lucas Mateus Castro \(alqotel\)" , =?utf-8?q?C=C3=A9dric_Le_Goater?= , David Gibson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: "Lucas Mateus Castro (alqotel)" Implement the following PowerISA v3.1 instructions: pmxvf16ger2: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) pmxvf16ger2nn: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Negative accumulate pmxvf16ger2np: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Positive accumulate pmxvf16ger2pn: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Negative accumulate pmxvf16ger2pp: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Positive accumulate pmxvf32ger: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) pmxvf32gernn: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate pmxvf32gernp: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate pmxvf32gerpn: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate pmxvf32gerpp: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate pmxvf64ger: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) pmxvf64gernn: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate pmxvf64gernp: 
Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate pmxvf64gerpn: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate pmxvf64gerpp: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) --- target/ppc/insn64.decode | 39 +++++++++++++++++++++++++++++ target/ppc/translate/vsx-impl.c.inc | 33 ++++++++++++++++++++++++ 2 files changed, 72 insertions(+) diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode index 18915f1977..bc5e4dfe1a 100644 --- a/target/ppc/insn64.decode +++ b/target/ppc/insn64.decode @@ -73,10 +73,16 @@ %xx3_xa 2:1 16:5 %xx3_xb 1:1 11:5 %xx3_at 23:3 !function=times_4 +%xx3_xa_pair 2:1 17:4 !function=times_2 @MMIRR_XX3 ...... .. .... .. . . ........ xmsk:4 ymsk:4 \ ...... ... .. ..... ..... ........ ... \ &MMIRR_XX3 xa=%xx3_xa xb=%xx3_xb xt=%xx3_at +&MMIRR_XX3_NO_P xa xb xt xmsk ymsk +@MMIRR_XX3_NO_P ...... .. .... .. . . ........ xmsk:4 .... \ + ...... ... .. ..... ..... ........ ... \ + &MMIRR_XX3_NO_P xb=%xx3_xb xt=%xx3_at + ### Fixed-Point Load Instructions PLBZ 000001 10 0--.-- .................. \ @@ -145,6 +151,39 @@ PMXVI16GER2S 000001 11 1001 -- - - pmsk:2 ------ ........ \ PMXVI16GER2SPP 000001 11 1001 -- - - pmsk:2 ------ ........ \ 111011 ... -- ..... ..... 00101010 ..- @MMIRR_XX3 +PMXVF16GER2 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 00010011 ..- @MMIRR_XX3 +PMXVF16GER2PP 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 00010010 ..- @MMIRR_XX3 +PMXVF16GER2PN 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 10010010 ..- @MMIRR_XX3 +PMXVF16GER2NP 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 01010010 ..- @MMIRR_XX3 +PMXVF16GER2NN 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 
11010010 ..- @MMIRR_XX3 + +PMXVF32GER 000001 11 1001 -- - - -------- .... ymsk:4 \ + 111011 ... -- ..... ..... 00011011 ..- @MMIRR_XX3_NO_P xa=%xx3_xa +PMXVF32GERPP 000001 11 1001 -- - - -------- .... ymsk:4 \ + 111011 ... -- ..... ..... 00011010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa +PMXVF32GERPN 000001 11 1001 -- - - -------- .... ymsk:4 \ + 111011 ... -- ..... ..... 10011010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa +PMXVF32GERNP 000001 11 1001 -- - - -------- .... ymsk:4 \ + 111011 ... -- ..... ..... 01011010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa +PMXVF32GERNN 000001 11 1001 -- - - -------- .... ymsk:4 \ + 111011 ... -- ..... ..... 11011010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa + +PMXVF64GER 000001 11 1001 -- - - -------- .... ymsk:2 -- \ + 111011 ... -- ....0 ..... 00111011 ..- @MMIRR_XX3_NO_P xa=%xx3_xa_pair +PMXVF64GERPP 000001 11 1001 -- - - -------- .... ymsk:2 -- \ + 111011 ... -- ....0 ..... 00111010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa_pair +PMXVF64GERPN 000001 11 1001 -- - - -------- .... ymsk:2 -- \ + 111011 ... -- ....0 ..... 10111010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa_pair +PMXVF64GERNP 000001 11 1001 -- - - -------- .... ymsk:2 -- \ + 111011 ... -- ....0 ..... 01111010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa_pair +PMXVF64GERNN 000001 11 1001 -- - - -------- .... ymsk:2 -- \ + 111011 ... -- ....0 ..... 
11111010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa_pair + ### Prefixed No-operation Instruction @PNOP 000001 11 0000-- 000000000000000000 \ diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index 9285e27159..06f5c1220d 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -2867,6 +2867,22 @@ static bool do_ger_MMIRR_XX3(DisasContext *ctx, arg_MMIRR_XX3 *a, uint32_t op, return true; } + +static bool do_ger_MMIRR_XX3_NO_PMSK(DisasContext *ctx, arg_MMIRR_XX3_NO_P *a, + int op_flag, void (*helper)(TCGv_env, + TCGv_i32, TCGv_i32, TCGv_i32, + TCGv_i32, TCGv_i32)) +{ + arg_MMIRR_XX3 x; + x.xa = a->xa; + x.xb = a->xb; + x.xt = a->xt; + x.pmsk = 0x1; + x.ymsk = a->ymsk; + x.xmsk = a->xmsk; + return do_ger_MMIRR_XX3(ctx, &x, op_flag, helper); +} + static bool do_ger_XX3(DisasContext *ctx, arg_XX3 *a, uint32_t op_flags, void (*helper)(TCGv_env, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32)) @@ -2935,6 +2951,23 @@ TRANS(XVF64GERPN, do_ger_XX3, GER_PN, gen_helper_XVF64GER) TRANS(XVF64GERNP, do_ger_XX3, GER_NP, gen_helper_XVF64GER) TRANS(XVF64GERNN, do_ger_XX3, GER_NN, gen_helper_XVF64GER) +TRANS64(PMXVF16GER2, do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVF16GER2) +TRANS64(PMXVF16GER2PP, do_ger_MMIRR_XX3, GER_PP, gen_helper_XVF16GER2) +TRANS64(PMXVF16GER2PN, do_ger_MMIRR_XX3, GER_PN, gen_helper_XVF16GER2) +TRANS64(PMXVF16GER2NP, do_ger_MMIRR_XX3, GER_NP, gen_helper_XVF16GER2) +TRANS64(PMXVF16GER2NN, do_ger_MMIRR_XX3, GER_NN, gen_helper_XVF16GER2) + +TRANS64(PMXVF32GER, do_ger_MMIRR_XX3_NO_PMSK, GER_NOP, gen_helper_XVF32GER) +TRANS64(PMXVF32GERPP, do_ger_MMIRR_XX3_NO_PMSK, GER_PP, gen_helper_XVF32GER) +TRANS64(PMXVF32GERPN, do_ger_MMIRR_XX3_NO_PMSK, GER_PN, gen_helper_XVF32GER) +TRANS64(PMXVF32GERNP, do_ger_MMIRR_XX3_NO_PMSK, GER_NP, gen_helper_XVF32GER) +TRANS64(PMXVF32GERNN, do_ger_MMIRR_XX3_NO_PMSK, GER_NN, gen_helper_XVF32GER) + +TRANS64(PMXVF64GER, do_ger_MMIRR_XX3_NO_PMSK, GER_NOP, gen_helper_XVF64GER) 
+TRANS64(PMXVF64GERPP, do_ger_MMIRR_XX3_NO_PMSK, GER_PP, gen_helper_XVF64GER) +TRANS64(PMXVF64GERPN, do_ger_MMIRR_XX3_NO_PMSK, GER_PN, gen_helper_XVF64GER) +TRANS64(PMXVF64GERNP, do_ger_MMIRR_XX3_NO_PMSK, GER_NP, gen_helper_XVF64GER) +TRANS64(PMXVF64GERNN, do_ger_MMIRR_XX3_NO_PMSK, GER_NN, gen_helper_XVF64GER) #undef GER_NOP #undef GER_PP From patchwork Tue Apr 26 12:50:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lucas Mateus Martins Araujo e Castro X-Patchwork-Id: 12827120 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 86E31C433F5 for ; Tue, 26 Apr 2022 13:09:39 +0000 (UTC) Received: from localhost ([::1]:44064 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njKwk-0001S2-IG for qemu-devel@archiver.kernel.org; Tue, 26 Apr 2022 09:09:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45462) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1njKen-0004pV-59; Tue, 26 Apr 2022 08:51:05 -0400 Received: from [187.72.171.209] (port=32914 helo=outlook.eldorado.org.br) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1njKel-00005X-Az; Tue, 26 Apr 2022 08:51:04 -0400 Received: from p9ibm ([10.10.71.235]) by outlook.eldorado.org.br over TLS secured channel with Microsoft SMTPSVC(8.5.9600.16384); Tue, 26 Apr 2022 09:50:38 -0300 Received: from eldorado.org.br (unknown [10.10.70.45]) by p9ibm (Postfix) with ESMTP id E1CB2801008; Tue, 26 Apr 2022 09:50:37 -0300 (-03) From: "Lucas Mateus Castro(alqotel)" To: qemu-ppc@nongnu.org Subject: [RFC PATCH 7/7] target/ppc: Implemented [pm]xvbf16ger2* Date: Tue, 26 Apr 2022 
09:50:28 -0300 Message-Id: <20220426125028.18844-8-lucas.araujo@eldorado.org.br> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220426125028.18844-1-lucas.araujo@eldorado.org.br> References: <20220426125028.18844-1-lucas.araujo@eldorado.org.br> MIME-Version: 1.0 X-OriginalArrivalTime: 26 Apr 2022 12:50:38.0459 (UTC) FILETIME=[3AC26CB0:01D8596C] X-Host-Lookup-Failed: Reverse DNS lookup failed for 187.72.171.209 (failed) Received-SPF: pass client-ip=187.72.171.209; envelope-from=lucas.araujo@eldorado.org.br; helo=outlook.eldorado.org.br X-Spam_score_int: -4 X-Spam_score: -0.5 X-Spam_bar: / X-Spam_report: (-0.5 / 5.0 requ) BAYES_00=-1.9, PDS_HP_HELO_NORDNS=0.659, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Daniel Henrique Barboza , richard.henderson@linaro.org, Greg Kurz , "open list:All patches CC here" , "Lucas Mateus Castro \(alqotel\)" , =?utf-8?q?C=C3=A9dric_Le_Goater?= , David Gibson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: "Lucas Mateus Castro (alqotel)" Implement the following PowerISA v3.1 instructions: xvbf16ger2: VSX Vector bfloat16 GER (rank-2 update) xvbf16ger2nn: VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Negative accumulate xvbf16ger2np: VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Positive accumulate xvbf16ger2pn: VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Negative accumulate xvbf16ger2pp: VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Positive accumulate pmxvbf16ger2: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) pmxvbf16ger2nn: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Negative accumulate pmxvbf16ger2np: Prefixed Masked VSX Vector 
bfloat16 GER (rank-2 update) Negative multiply, Positive accumulate pmxvbf16ger2pn: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Negative accumulate pmxvbf16ger2pp: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) --- There's a discrepancy between this implementation and mambo/the hardware: implementing it with float32_mul followed by float32_muladd gives an incorrect sign for some zero or infinite results, whereas implementing it with a multiplication followed by a muladd using FloatParts64 gives a different result in operations where an underflow would have occurred in the first multiplication had it been rounded to 32 bits. I have not been able to resolve this yet. --- target/ppc/cpu.h | 3 +++ target/ppc/fpu_helper.c | 1 + target/ppc/helper.h | 1 + target/ppc/insn32.decode | 6 ++++++ target/ppc/insn64.decode | 11 +++++++++++ target/ppc/translate/vsx-impl.c.inc | 12 ++++++++++++ 6 files changed, 34 insertions(+) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index 91167f8cc0..10780adf65 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -225,6 +225,7 @@ typedef union _ppc_vsr_t { int16_t s16[8]; int32_t s32[4]; int64_t s64[2]; + bfloat16 bf16[8]; float16 f16[8]; float32 f32[4]; float64 f64[2]; @@ -2653,6 +2654,7 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx) #define VsrSW(i) s32[i] #define VsrD(i) u64[i] #define VsrSD(i) s64[i] +#define VsrBF(i) bf16[i] #define VsrHF(i) f16[i] #define VsrSF(i) f32[i] #define VsrDF(i) f64[i] @@ -2665,6 +2667,7 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx) #define VsrSW(i) s32[3 - (i)] #define VsrD(i) u64[1 - (i)] #define VsrSD(i) s64[1 - (i)] +#define VsrBF(i) bf16[7 - (i)] #define VsrHF(i) f16[7 - (i)] #define VsrSF(i) f32[3 - (i)] #define VsrDF(i) f64[1 - (i)] diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index c3aead642a..9acba0f804 100644 ---
a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -3583,6 +3583,7 @@ static inline bool ger_neg_acc_flag(uint32_t flag) do_float_check_status(env, GETPC()); \ } +VSXGER16(helper_XVBF16GER2, bfloat16, BF) VSXGER16(helper_XVF16GER2, float16, HF) VSXGER(helper_XVF32GER, float32, SF) VSXGER(helper_XVF64GER, float64, DF) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index cc59a3b71d..68748ecc03 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -540,6 +540,7 @@ DEF_HELPER_5(XXBLENDVD, void, vsr, vsr, vsr, vsr, i32) DEF_HELPER_6(XVI4GER8, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(XVI8GER4, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(XVI16GER2, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(XVBF16GER2, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(XVF16GER2, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(XVF32GER, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(XVF64GER, void, env, i32, i32, i32, i32, i32) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index a204730d1d..fff6e406f0 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -736,6 +736,12 @@ XVI8GER4SPP 111011 ... -- ..... ..... 01100011 ..- @XX3_at xa=%xx_xa XVI16GER2S 111011 ... -- ..... ..... 00101011 ..- @XX3_at xa=%xx_xa XVI16GER2SPP 111011 ... -- ..... ..... 00101010 ..- @XX3_at xa=%xx_xa +XVBF16GER2 111011 ... -- ..... ..... 00110011 ..- @XX3_at xa=%xx_xa +XVBF16GER2PP 111011 ... -- ..... ..... 00110010 ..- @XX3_at xa=%xx_xa +XVBF16GER2PN 111011 ... -- ..... ..... 10110010 ..- @XX3_at xa=%xx_xa +XVBF16GER2NP 111011 ... -- ..... ..... 01110010 ..- @XX3_at xa=%xx_xa +XVBF16GER2NN 111011 ... -- ..... ..... 11110010 ..- @XX3_at xa=%xx_xa + XVF16GER2 111011 ... -- ..... ..... 00010011 ..- @XX3_at xa=%xx_xa XVF16GER2PP 111011 ... -- ..... ..... 00010010 ..- @XX3_at xa=%xx_xa XVF16GER2PN 111011 ... -- ..... ..... 
10010010 ..- @XX3_at xa=%xx_xa diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode index bc5e4dfe1a..4cd6219ad5 100644 --- a/target/ppc/insn64.decode +++ b/target/ppc/insn64.decode @@ -151,6 +151,17 @@ PMXVI16GER2S 000001 11 1001 -- - - pmsk:2 ------ ........ \ PMXVI16GER2SPP 000001 11 1001 -- - - pmsk:2 ------ ........ \ 111011 ... -- ..... ..... 00101010 ..- @MMIRR_XX3 +PMXVBF16GER2 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 00110011 ..- @MMIRR_XX3 +PMXVBF16GER2PP 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 00110010 ..- @MMIRR_XX3 +PMXVBF16GER2PN 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 10110010 ..- @MMIRR_XX3 +PMXVBF16GER2NP 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 01110010 ..- @MMIRR_XX3 +PMXVBF16GER2NN 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 11110010 ..- @MMIRR_XX3 + PMXVF16GER2 000001 11 1001 -- - - pmsk:2 ------ ........ \ 111011 ... -- ..... ..... 00010011 ..- @MMIRR_XX3 PMXVF16GER2PP 000001 11 1001 -- - - pmsk:2 ------ ........ 
\ diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index 06f5c1220d..bb5e6f0693 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -2933,6 +2933,12 @@ TRANS64(PMXVI16GER2SPP, do_ger_MMIRR_XX3, GER_SPP, gen_helper_XVI16GER2) #define GER_PN ger_pack_flags_xvf(true, false, true) #define GER_NN ger_pack_flags_xvf(true, true, true) +TRANS(XVBF16GER2, do_ger_XX3, GER_NOP, gen_helper_XVBF16GER2) +TRANS(XVBF16GER2PP, do_ger_XX3, GER_PP, gen_helper_XVBF16GER2) +TRANS(XVBF16GER2PN, do_ger_XX3, GER_PN, gen_helper_XVBF16GER2) +TRANS(XVBF16GER2NP, do_ger_XX3, GER_NP, gen_helper_XVBF16GER2) +TRANS(XVBF16GER2NN, do_ger_XX3, GER_NN, gen_helper_XVBF16GER2) + TRANS(XVF16GER2, do_ger_XX3, GER_NOP, gen_helper_XVF16GER2) TRANS(XVF16GER2PP, do_ger_XX3, GER_PP, gen_helper_XVF16GER2) TRANS(XVF16GER2PN, do_ger_XX3, GER_PN, gen_helper_XVF16GER2) @@ -2957,6 +2963,12 @@ TRANS64(PMXVF16GER2PN, do_ger_MMIRR_XX3, GER_PN, gen_helper_XVF16GER2) TRANS64(PMXVF16GER2NP, do_ger_MMIRR_XX3, GER_NP, gen_helper_XVF16GER2) TRANS64(PMXVF16GER2NN, do_ger_MMIRR_XX3, GER_NN, gen_helper_XVF16GER2) +TRANS64(PMXVBF16GER2, do_ger_MMIRR_XX3, GER_NOP, gen_helper_XVBF16GER2) +TRANS64(PMXVBF16GER2PP, do_ger_MMIRR_XX3, GER_PP, gen_helper_XVBF16GER2) +TRANS64(PMXVBF16GER2PN, do_ger_MMIRR_XX3, GER_PN, gen_helper_XVBF16GER2) +TRANS64(PMXVBF16GER2NP, do_ger_MMIRR_XX3, GER_NP, gen_helper_XVBF16GER2) +TRANS64(PMXVBF16GER2NN, do_ger_MMIRR_XX3, GER_NN, gen_helper_XVBF16GER2) + TRANS64(PMXVF32GER, do_ger_MMIRR_XX3_NO_PMSK, GER_NOP, gen_helper_XVF32GER) TRANS64(PMXVF32GERPP, do_ger_MMIRR_XX3_NO_PMSK, GER_PP, gen_helper_XVF32GER) TRANS64(PMXVF32GERPN, do_ger_MMIRR_XX3_NO_PMSK, GER_PN, gen_helper_XVF32GER)
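
As an illustration of the rounding discrepancy mentioned in the note before the diffstat, here is a standalone sketch (not QEMU code) of how the intermediate precision of the first product can change a rank-2 result. It assumes that `double` is an adequate stand-in for a wide intermediate; it does not use softfloat or FloatParts64, and it does not claim which behaviour matches the hardware. The names `bf16_to_f32`, `f32_bits`, `ger2_round_first` and `ger2_round_once` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Widen a bfloat16 bit pattern: bfloat16 is the upper 16 bits of a binary32. */
static float bf16_to_f32(uint16_t b)
{
    uint32_t bits = (uint32_t)b << 16;
    float f;
    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* Return the raw binary32 encoding of f, for exact comparisons. */
static uint32_t f32_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof(bits));
    return bits;
}

/*
 * Strategy A (assumption: models "multiply, round to 32 bits, then add"):
 * the first product is rounded to binary32 before the second is added.
 */
static float ger2_round_first(uint16_t a0, uint16_t b0,
                              uint16_t a1, uint16_t b1)
{
    /* Products of two bfloat16 values are exact in double. */
    double p0 = (double)bf16_to_f32(a0) * (double)bf16_to_f32(b0);
    double p1 = (double)bf16_to_f32(a1) * (double)bf16_to_f32(b1);
    float p0_rounded = (float)p0;   /* may lose bits on underflow */
    return (float)((double)p0_rounded + p1);
}

/*
 * Strategy B (assumption: models a wide intermediate):
 * both products stay in double and the sum is rounded to binary32 once.
 */
static float ger2_round_once(uint16_t a0, uint16_t b0,
                             uint16_t a1, uint16_t b1)
{
    double p0 = (double)bf16_to_f32(a0) * (double)bf16_to_f32(b0);
    double p1 = (double)bf16_to_f32(a1) * (double)bf16_to_f32(b1);
    return (float)(p0 + p1);
}
```

With a0 = 0x1A40 (1.5 * 2^-75) and b0 = a1 = b1 = 0x1A00 (2^-75), the first product is 1.5 * 2^-150, which is below the smallest binary32 subnormal (2^-149). Rounding it first gives 2^-149, and the final sum then rounds to 2^-148. Keeping it wide gives a sum of 1.25 * 2^-149, which rounds to 2^-149. The two strategies therefore differ by one unit in the last place under the default round-to-nearest-even mode.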