From patchwork Wed Apr 17 15:33:21 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Mateja Marjanovic
X-Patchwork-Id: 10905565
From: Mateja Marjanovic
To: qemu-devel@nongnu.org
Date: Wed, 17 Apr 2019 17:33:21 +0200
Message-Id: <1555515206-9352-2-git-send-email-mateja.marjanovic@rt-rk.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1555515206-9352-1-git-send-email-mateja.marjanovic@rt-rk.com>
References: <1555515206-9352-1-git-send-email-mateja.marjanovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v7 1/6] target/mips: Optimize ILVOD. MSA instructions
Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com,
 amarkovic@wavecomp.com, aurelien@aurel32.net

Optimize the set of ILVOD MSA instructions, operating directly on TCG
registers and performing the logic on them instead of calling helpers.

In the following table, the first column is the performance before this
patch. The second is the performance after converting from helpers to
TCG, but without using the tcg_gen_deposit function. The third is with
the deposit function and a uint64_t constant bit mask, and the fourth is
with the deposit function and a mask which is a TCG constant. The fourth
is implemented in this patch.
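
As a rough illustration of why the deposit-based variant of ILVOD.W is
interchangeable with the mask-and-shift variant, the host-side C sketch
below models one 64-bit half of a vector register. It is not QEMU code:
deposit64_model and the two ilvod_w_* functions are hypothetical names
introduced here, and deposit64_model is only assumed to mirror the
bit-field semantics of tcg_gen_deposit_i64 (replace len bits of the base
at offset pos with the low bits of the value).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a 64-bit deposit: replace 'len' bits of 'base',
 * starting at bit 'pos', with the low 'len' bits of 'val'.
 * Assumes 0 < len and pos + len <= 64.
 */
static uint64_t deposit64_model(uint64_t base, uint64_t val,
                                unsigned pos, unsigned len)
{
    uint64_t field = (len == 64) ? ~0ULL : ((1ULL << len) - 1);

    return (base & ~(field << pos)) | ((val & field) << pos);
}

/* ILVOD.W on one 64-bit half, written with explicit mask, shift and or. */
static uint64_t ilvod_w_mask_shift(uint64_t ws, uint64_t wt)
{
    const uint64_t mask = 0xffffffff00000000ULL;

    /* Odd word of wt moves to the even slot, odd word of ws stays put. */
    return ((wt & mask) >> 32) | (ws & mask);
}

/* The same operation written as one shift followed by one deposit. */
static uint64_t ilvod_w_deposit(uint64_t ws, uint64_t wt)
{
    return deposit64_model(ws, wt >> 32, 0, 32);
}
```

Both functions keep the upper word of ws and place the upper word of wt
in the lower word of the result; the deposit form simply needs no mask
constant and no second temporary.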

Performance measurement is done by executing the instructions 10 million
times on a computer with Intel Core i7-3770 CPU @ 3.40GHz×8.

Reviewed-by: Richard Henderson

==================================================================
|| instruction ||     1     ||    2     ||    3     ||    4     ||
==================================================================
||   ilvod.b   || 117.50 ms || 24.13 ms || 24.45 ms || 23.24 ms ||
||   ilvod.h   ||  93.16 ms || 24.21 ms || 24.28 ms || 23.20 ms ||
||   ilvod.w   || 119.90 ms || 24.15 ms || 23.19 ms || 22.95 ms ||
||   ilvod.d   ||  43.01 ms || 21.17 ms || 23.07 ms || 22.59 ms ||
==================================================================

1 - before
2 - no-deposit-no-mask-as-tcg-constant
3 - with-deposit-no-mask-as-tcg-constant
4 - with-deposit-with-mask-as-tcg-constant (final)

The deposit function is used only in ILVOD.W.

No-deposit version of the ILVOD.W implementation:

static inline void gen_ilvod_w(CPUMIPSState *env, uint32_t wd,
                               uint32_t ws, uint32_t wt)
{
    TCGv_i64 t1 = tcg_temp_new_i64();
    TCGv_i64 t2 = tcg_temp_new_i64();
    TCGv_i64 mask = tcg_const_i64(0xffffffff00000000ULL);

    tcg_gen_and_i64(t1, msa_wr_d[wt * 2], mask);
    tcg_gen_shri_i64(t1, t1, 32);
    tcg_gen_and_i64(t2, msa_wr_d[ws * 2], mask);
    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);

    tcg_gen_and_i64(t1, msa_wr_d[wt * 2 + 1], mask);
    tcg_gen_shri_i64(t1, t1, 32);
    tcg_gen_and_i64(t2, msa_wr_d[ws * 2 + 1], mask);
    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);

    tcg_temp_free_i64(mask);
    tcg_temp_free_i64(t1);
    tcg_temp_free_i64(t2);
}

Suggested-by: Aleksandar Markovic
Suggested-by: Philippe Mathieu-Daudé
Suggested-by: Richard Henderson
Signed-off-by: Mateja Marjanovic
---
 target/mips/helper.h     |  1 -
 target/mips/msa_helper.c |  7 ----
 target/mips/translate.c  | 91 +++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 90 insertions(+), 9 deletions(-)

diff --git a/target/mips/helper.h b/target/mips/helper.h
index 2863f60..02e16c7 100644
--- a/target/mips/helper.h
+++ b/target/mips/helper.h
@@ -865,7 +865,6 @@ DEF_HELPER_5(msa_pckod_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvl_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvr_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvev_df, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(msa_ilvod_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_vshf_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srar_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srlr_df, void, env, i32, i32, i32, i32)
diff --git a/target/mips/msa_helper.c b/target/mips/msa_helper.c
index 6c57281..a7ea6aa 100644
--- a/target/mips/msa_helper.c
+++ b/target/mips/msa_helper.c
@@ -1206,13 +1206,6 @@ MSA_FN_DF(ilvr_df)
 MSA_FN_DF(ilvev_df)
 #undef MSA_DO
 
-#define MSA_DO(DF) \
-    do { \
-        pwx->DF[2*i] = pwt->DF[2*i+1]; \
-        pwx->DF[2*i+1] = pws->DF[2*i+1]; \
-    } while (0)
-MSA_FN_DF(ilvod_df)
-#undef MSA_DO
 #undef MSA_LOOP_COND
 
 #define MSA_LOOP_COND(DF) \
diff --git a/target/mips/translate.c b/target/mips/translate.c
index bba8b6c..4c8fef0 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -28884,6 +28884,80 @@ static void gen_msa_bit(CPUMIPSState *env, DisasContext *ctx)
     tcg_temp_free_i32(tws);
 }
 
+/*
+ * [MSA] ILVOD. wd, ws, wt
+ *
+ * Vector Interleave Odd ( data elements)
+ *
+ */
+static inline void gen_ilvod_bh(CPUMIPSState *env, uint32_t wd,
+                                uint32_t ws, uint32_t wt,
+                                uint64_t mask, uint32_t shift)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    TCGv_i64 mask_tcg = tcg_const_i64(mask);
+
+    tcg_gen_and_i64(t1, msa_wr_d[wt * 2], mask_tcg);
+    tcg_gen_shri_i64(t1, t1, shift);
+    tcg_gen_and_i64(t2, msa_wr_d[ws * 2], mask_tcg);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);
+
+    tcg_gen_and_i64(t1, msa_wr_d[wt * 2 + 1], mask_tcg);
+    tcg_gen_shri_i64(t1, t1, shift);
+    tcg_gen_and_i64(t2, msa_wr_d[ws * 2 + 1], mask_tcg);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);
+
+    tcg_temp_free_i64(mask_tcg);
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+static inline void gen_ilvod_b(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    gen_ilvod_bh(env, wd, ws, wt, 0xff00ff00ff00ff00ULL, 8);
+}
+
+static inline void gen_ilvod_h(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    gen_ilvod_bh(env, wd, ws, wt, 0xffff0000ffff0000ULL, 16);
+}
+
+/*
+ * [MSA] ILVOD.W wd, ws, wt
+ *
+ * Vector Interleave Odd (word data elements)
+ *
+ */
+static inline void gen_ilvod_w(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t1, msa_wr_d[wt * 2], 32);
+    tcg_gen_deposit_i64(msa_wr_d[wd * 2], msa_wr_d[ws * 2], t1, 0, 32);
+
+    tcg_gen_shri_i64(t1, msa_wr_d[wt * 2 + 1], 32);
+    tcg_gen_deposit_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1], t1, 0, 32);
+
+    tcg_temp_free_i64(t1);
+}
+
+/*
+ * [MSA] ILVOD.D wd, ws, wt
+ *
+ * Vector Interleave Odd (doubleword data elements)
+ *
+ */
+static inline void gen_ilvod_d(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2 + 1]);
+    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]);
+}
+
 static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
 {
 #define MASK_MSA_3R(op) (MASK_MSA_MINOR(op) | (op & (0x7 << 23)))
@@ -29055,7 +29129,22 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
         gen_helper_msa_mod_u_df(cpu_env, tdf, twd, tws, twt);
         break;
     case OPC_ILVOD_df:
-        gen_helper_msa_ilvod_df(cpu_env, tdf, twd, tws, twt);
+        switch (df) {
+        case DF_BYTE:
+            gen_ilvod_b(env, wd, ws, wt);
+            break;
+        case DF_HALF:
+            gen_ilvod_h(env, wd, ws, wt);
+            break;
+        case DF_WORD:
+            gen_ilvod_w(env, wd, ws, wt);
+            break;
+        case DF_DOUBLE:
+            gen_ilvod_d(env, wd, ws, wt);
+            break;
+        default:
+            assert(0);
+        }
         break;
     case OPC_DOTP_S_df:

From patchwork Wed Apr 17 15:33:22 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Mateja Marjanovic
X-Patchwork-Id: 10905575
From: Mateja Marjanovic
To: qemu-devel@nongnu.org
Date: Wed, 17 Apr 2019 17:33:22 +0200
Message-Id: <1555515206-9352-3-git-send-email-mateja.marjanovic@rt-rk.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1555515206-9352-1-git-send-email-mateja.marjanovic@rt-rk.com>
References: <1555515206-9352-1-git-send-email-mateja.marjanovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v7 2/6] target/mips: Optimize ILVEV. MSA instructions
Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com,
 amarkovic@wavecomp.com, aurelien@aurel32.net

Optimize the set of ILVEV MSA instructions, operating directly on TCG
registers and performing the logic on them instead of calling helpers.

In the following table, the first column is the performance before this
patch. The second is the performance after converting from helpers to
TCG, but without using the tcg_gen_deposit function. The third is with
the tcg_gen_deposit function and a uint64_t constant bit mask, and the
fourth is with the tcg_gen_deposit function and a mask which is a TCG
constant. The fourth is implemented in this patch.

Performance measurement is done by executing the instructions 10 million
times on a computer with Intel Core i7-3770 CPU @ 3.40GHz×8.

Reviewed-by: Richard Henderson

==================================================================
|| instruction ||     1     ||    2     ||    3     ||    4     ||
==================================================================
||   ilvev.b   || 126.92 ms || 24.52 ms || 25.19 ms || 23.89 ms ||
||   ilvev.h   ||  93.67 ms || 23.92 ms || 24.76 ms || 24.31 ms ||
||   ilvev.w   || 117.86 ms || 23.83 ms || 21.84 ms || 21.99 ms ||
||   ilvev.d   ||  45.49 ms || 19.74 ms || 20.21 ms || 20.07 ms ||
==================================================================

1 - before
2 - no-deposit-no-mask-as-tcg-constant
3 - with-deposit-no-mask-as-tcg-constant
4 - with-deposit-with-mask-as-tcg-constant (final)

The deposit function is used only in ILVEV.W.
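
To make the even-interleave formulas easier to check by hand, here is a
small host-side C sketch that models one 64-bit half of a vector
register. It is not QEMU code and all three function names are
hypothetical: ilvev_ref follows the element-wise definition (even
elements of wt go to even positions, even elements of ws go to odd
positions), ilvev_mask_shift is the mask-and-shift form used for byte
and halfword elements, and ilvev_w_deposit writes out by hand what a
32-bit deposit at bit offset 32 is assumed to compute.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Element-wise reference for one 64-bit half.
 * 'bits' is the element width and must be 8, 16 or 32.
 */
static uint64_t ilvev_ref(uint64_t ws, uint64_t wt, unsigned bits)
{
    const uint64_t elem_mask = (1ULL << bits) - 1;
    uint64_t out = 0;

    /* Walk over the even-indexed elements only. */
    for (unsigned pos = 0; pos < 64; pos += 2 * bits) {
        out |= ((wt >> pos) & elem_mask) << pos;          /* even slot */
        out |= ((ws >> pos) & elem_mask) << (pos + bits); /* odd slot  */
    }
    return out;
}

/* Mask-and-shift form: 'mask' selects the even elements. */
static uint64_t ilvev_mask_shift(uint64_t ws, uint64_t wt,
                                 uint64_t mask, unsigned shift)
{
    return (wt & mask) | ((ws & mask) << shift);
}

/* Word case: keep the low word of wt, put the low word of ws above it. */
static uint64_t ilvev_w_deposit(uint64_t ws, uint64_t wt)
{
    return (wt & 0x00000000ffffffffULL) | (ws << 32);
}
```

For the word case the left shift already discards the upper word of ws,
which is why no mask is needed once the operation is expressed as a
deposit.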

No-deposit version of the ILVEV.W implementation:

static inline void gen_ilvev_w(CPUMIPSState *env, uint32_t wd,
                               uint32_t ws, uint32_t wt)
{
    TCGv_i64 t1 = tcg_temp_new_i64();
    TCGv_i64 t2 = tcg_temp_new_i64();
    uint64_t mask = 0x00000000ffffffffULL;

    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
    tcg_gen_andi_i64(t2, msa_wr_d[ws * 2], mask);
    tcg_gen_shli_i64(t2, t2, 32);
    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);

    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
    tcg_gen_andi_i64(t2, msa_wr_d[ws * 2 + 1], mask);
    tcg_gen_shli_i64(t2, t2, 32);
    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);

    tcg_temp_free_i64(t1);
    tcg_temp_free_i64(t2);
}

Suggested-by: Aleksandar Markovic
Suggested-by: Philippe Mathieu-Daudé
Suggested-by: Richard Henderson
Signed-off-by: Mateja Marjanovic
---
 target/mips/helper.h     |  1 -
 target/mips/msa_helper.c |  9 -----
 target/mips/translate.c  | 87 +++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 86 insertions(+), 11 deletions(-)

diff --git a/target/mips/helper.h b/target/mips/helper.h
index 02e16c7..82f6a40 100644
--- a/target/mips/helper.h
+++ b/target/mips/helper.h
@@ -864,7 +864,6 @@ DEF_HELPER_5(msa_pckev_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckod_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvl_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvr_df, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(msa_ilvev_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_vshf_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srar_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srlr_df, void, env, i32, i32, i32, i32)
diff --git a/target/mips/msa_helper.c b/target/mips/msa_helper.c
index a7ea6aa..d5c3842 100644
--- a/target/mips/msa_helper.c
+++ b/target/mips/msa_helper.c
@@ -1197,15 +1197,6 @@ MSA_FN_DF(ilvl_df)
     } while (0)
 MSA_FN_DF(ilvr_df)
 #undef MSA_DO
-
-#define MSA_DO(DF) \
-    do { \
-        pwx->DF[2*i] = pwt->DF[2*i]; \
-        pwx->DF[2*i+1] = pws->DF[2*i]; \
-    } while (0)
-MSA_FN_DF(ilvev_df)
-#undef MSA_DO
-
 #undef MSA_LOOP_COND
 
 #define MSA_LOOP_COND(DF) \
diff --git a/target/mips/translate.c b/target/mips/translate.c
index 4c8fef0..95bcc65 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -28958,6 +28958,76 @@ static inline void gen_ilvod_d(CPUMIPSState *env, uint32_t wd,
     tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]);
 }
 
+
+/*
+ * [MSA] ILVEV. wd, ws, wt
+ *
+ * Vector Interleave Even ( data elements)
+ *
+ */
+static inline void gen_ilvev_bh(CPUMIPSState *env, uint32_t wd,
+                                uint32_t ws, uint32_t wt,
+                                uint64_t mask, uint32_t shift)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    TCGv_i64 mask_tcg = tcg_const_i64(mask);
+
+    tcg_gen_and_i64(t1, msa_wr_d[wt * 2], mask_tcg);
+    tcg_gen_and_i64(t2, msa_wr_d[ws * 2], mask_tcg);
+    tcg_gen_shli_i64(t2, t2, shift);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);
+
+    tcg_gen_and_i64(t1, msa_wr_d[wt * 2 + 1], mask_tcg);
+    tcg_gen_and_i64(t2, msa_wr_d[ws * 2 + 1], mask_tcg);
+    tcg_gen_shli_i64(t2, t2, shift);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);
+
+    tcg_temp_free_i64(mask_tcg);
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+static inline void gen_ilvev_b(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    gen_ilvev_bh(env, wd, ws, wt, 0x00ff00ff00ff00ffULL, 8);
+}
+
+static inline void gen_ilvev_h(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    gen_ilvev_bh(env, wd, ws, wt, 0x0000ffff0000ffffULL, 16);
+}
+
+/*
+ * [MSA] ILVEV.W wd, ws, wt
+ *
+ * Vector Interleave Even (word data elements)
+ *
+ */
+static inline void gen_ilvev_w(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    tcg_gen_deposit_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2],
+                        msa_wr_d[ws * 2], 32, 32);
+    tcg_gen_deposit_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[wt * 2 + 1],
+                        msa_wr_d[ws * 2 + 1], 32, 32);
+}
+
+/*
+ * [MSA] ILVEV.D wd, ws, wt
+ *
+ * Vector Interleave Even (Doubleword data elements)
+ *
+ */
+static inline void gen_ilvev_d(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2]);
+    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2]);
+}
+
 static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
 {
 #define MASK_MSA_3R(op) (MASK_MSA_MINOR(op) | (op & (0x7 << 23)))
@@ -29114,7 +29184,22 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
         gen_helper_msa_mod_s_df(cpu_env, tdf, twd, tws, twt);
         break;
     case OPC_ILVEV_df:
-        gen_helper_msa_ilvev_df(cpu_env, tdf, twd, tws, twt);
+        switch (df) {
+        case DF_BYTE:
+            gen_ilvev_b(env, wd, ws, wt);
+            break;
+        case DF_HALF:
+            gen_ilvev_h(env, wd, ws, wt);
+            break;
+        case DF_WORD:
+            gen_ilvev_w(env, wd, ws, wt);
+            break;
+        case DF_DOUBLE:
+            gen_ilvev_d(env, wd, ws, wt);
+            break;
+        default:
+            assert(0);
+        }
         break;
     case OPC_BINSR_df:
         gen_helper_msa_binsr_df(cpu_env, tdf, twd, tws, twt);

From patchwork Wed Apr 17 15:33:23 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Mateja Marjanovic
X-Patchwork-Id: 10905563
From: Mateja Marjanovic
To: qemu-devel@nongnu.org
Date: Wed, 17 Apr 2019 17:33:23 +0200
Message-Id: <1555515206-9352-4-git-send-email-mateja.marjanovic@rt-rk.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1555515206-9352-1-git-send-email-mateja.marjanovic@rt-rk.com>
References: <1555515206-9352-1-git-send-email-mateja.marjanovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v7 3/6] target/mips: Optimize ILVL. MSA instructions
Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com,
 amarkovic@wavecomp.com, aurelien@aurel32.net

Optimize the ILVL instructions using a hybrid approach. For byte data
elements, use a helper with an unrolled loop, which performs much better
than direct TCG translation. For halfword, word and doubleword data
elements, operate directly on TCG registers and perform the logic on
them.

Performance measurement is done by executing the instructions 10 million
times on a computer with Intel Core i7-3770 CPU @ 3.40GHz×8.

============================================================
||instruction||  helper  ||   tcg    ||      hybrid       ||
============================================================
||  ilvl.b   || 59.91 ms || 74.41 ms || 60.50 ms (helper) ||
||  ilvl.h   || 41.33 ms || 33.08 ms || 33.34 ms (tcg)    ||
||  ilvl.w   || 30.99 ms || 22.87 ms || 23.19 ms (tcg)    ||
||  ilvl.d   || 26.40 ms || 19.64 ms || 20.49 ms (tcg)    ||
============================================================

Suggested-by: Aleksandar Markovic
Signed-off-by: Mateja Marjanovic
---
 target/mips/helper.h     |   3 +-
 target/mips/msa_helper.c |  33 +++++++++++----
 target/mips/translate.c  | 106 ++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 132 insertions(+), 10 deletions(-)

diff --git a/target/mips/helper.h b/target/mips/helper.h
index 82f6a40..f737a0e 100644
--- a/target/mips/helper.h
+++ b/target/mips/helper.h
@@ -862,7 +862,6 @@ DEF_HELPER_5(msa_sld_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_splat_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckev_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckod_df, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(msa_ilvl_df, void, env, i32, i32, i32, i32) DEF_HELPER_5(msa_ilvr_df, void, env, i32, i32, i32, i32) DEF_HELPER_5(msa_vshf_df, void, env, i32, i32, i32, i32) DEF_HELPER_5(msa_srar_df, void, env, i32, i32, i32, i32) @@ -934,6 +933,8 @@ DEF_HELPER_4(msa_pcnt_df, void, env, i32, i32, i32) DEF_HELPER_4(msa_nloc_df, void, env, i32, i32, i32) DEF_HELPER_4(msa_nlzc_df, void, env, i32, i32, i32) +DEF_HELPER_4(msa_ilvl_b, void, env, i32, i32, i32) + DEF_HELPER_4(msa_copy_s_b, void, env, i32, i32, i32) DEF_HELPER_4(msa_copy_s_h, void, env, i32, i32, i32) DEF_HELPER_4(msa_copy_s_w, void, env, i32, i32, i32) diff --git a/target/mips/msa_helper.c b/target/mips/msa_helper.c index d5c3842..20ce447 100644 --- a/target/mips/msa_helper.c +++ b/target/mips/msa_helper.c @@ -1184,14 +1184,6 @@ MSA_FN_DF(pckod_df) #define MSA_DO(DF) \ do { \ - pwx->DF[2*i] = L##DF(pwt, i); \ - pwx->DF[2*i+1] = L##DF(pws, i); \ - } while (0) -MSA_FN_DF(ilvl_df) -#undef MSA_DO - -#define MSA_DO(DF) \ - do { \ pwx->DF[2*i] = R##DF(pwt, i); \ pwx->DF[2*i+1] = R##DF(pws, i); \ } while (0) @@ -1214,6 +1206,31 @@ MSA_FN_DF(vshf_df) #undef MSA_LOOP_COND #undef MSA_FN_DF +void helper_msa_ilvl_b(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt) +{ + wr_t *pwd = &(env->active_fpu.fpr[wd].wr); + wr_t *pws = &(env->active_fpu.fpr[ws].wr); + wr_t *pwt = &(env->active_fpu.fpr[wt].wr); + + pwd->b[0] = pwt->b[8]; + pwd->b[1] = pws->b[8]; + pwd->b[2] = pwt->b[9]; + pwd->b[3] = pws->b[9]; + pwd->b[4] = pwt->b[10]; + pwd->b[5] = pws->b[10]; + pwd->b[6] = pwt->b[11]; + pwd->b[7] = pws->b[11]; + pwd->b[8] = pwt->b[12]; + pwd->b[9] = pws->b[12]; + pwd->b[10] = pwt->b[13]; + pwd->b[11] = pws->b[13]; + pwd->b[12] = pwt->b[14]; + pwd->b[13] = pws->b[14]; + pwd->b[14] = pwt->b[15]; + pwd->b[15] = pws->b[15]; +} + void helper_msa_sldi_df(CPUMIPSState *env, uint32_t df, uint32_t wd, uint32_t ws, uint32_t n) { diff --git a/target/mips/translate.c b/target/mips/translate.c index 95bcc65..ee60f29 100644 --- 
a/target/mips/translate.c +++ b/target/mips/translate.c @@ -28885,6 +28885,95 @@ static void gen_msa_bit(CPUMIPSState *env, DisasContext *ctx) } /* + * [MSA] ILVL.H wd, ws, wt + * + * Vector Interleave Left (halfword data elements) + * + */ +static inline void gen_ilvl_h(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt) +{ + TCGv_i64 t1 = tcg_temp_new_i64(); + TCGv_i64 t2 = tcg_temp_new_i64(); + uint64_t mask = 0x000000000000ffffULL; + + tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask); + tcg_gen_mov_i64(t2, t1); + tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask); + tcg_gen_shli_i64(t1, t1, 16); + tcg_gen_or_i64(t2, t2, t1); + + mask = 0x00000000ffff0000ULL; + tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask); + tcg_gen_shli_i64(t1, t1, 16); + tcg_gen_or_i64(t2, t2, t1); + tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask); + tcg_gen_shli_i64(t1, t1, 32); + tcg_gen_or_i64(msa_wr_d[wd * 2], t2, t1); + + mask = 0x0000ffff00000000ULL; + tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask); + tcg_gen_shri_i64(t1, t1, 32); + tcg_gen_mov_i64(t2, t1); + tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask); + tcg_gen_shri_i64(t1, t1, 16); + tcg_gen_or_i64(t2, t2, t1); + + mask = 0xffff000000000000ULL; + tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask); + tcg_gen_shri_i64(t1, t1, 16); + tcg_gen_or_i64(t2, t2, t1); + tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask); + tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t2, t1); + + tcg_temp_free_i64(t1); + tcg_temp_free_i64(t2); +} + +/* + * [MSA] ILVL.W wd, ws, wt + * + * Vector Interleave Left (word data elements) + * + */ +static inline void gen_ilvl_w(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt) +{ + TCGv_i64 t1 = tcg_temp_new_i64(); + TCGv_i64 t2 = tcg_temp_new_i64(); + uint64_t mask = 0x00000000ffffffffULL; + + tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask); + tcg_gen_mov_i64(t2, t1); + tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask); + tcg_gen_shli_i64(t1, t1, 32); + tcg_gen_or_i64(msa_wr_d[wd * 2], t2, t1); + + 
mask = 0xffffffff00000000ULL; + tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask); + tcg_gen_shri_i64(t1, t1, 32); + tcg_gen_mov_i64(t2, t1); + tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask); + tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t2, t1); + + tcg_temp_free_i64(t1); + tcg_temp_free_i64(t2); +} + +/* + * [MSA] ILVL.D wd, ws, wt + * + * Vector Interleave Left (doubleword data elements) + * + */ +static inline void gen_ilvl_d(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt) +{ + tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2 + 1]); + tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]); +} + +/* * [MSA] ILVOD. wd, ws, wt * * Vector Interleave Odd ( data elements) @@ -29148,7 +29237,22 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx) gen_helper_msa_div_s_df(cpu_env, tdf, twd, tws, twt); break; case OPC_ILVL_df: - gen_helper_msa_ilvl_df(cpu_env, tdf, twd, tws, twt); + switch (df) { + case DF_BYTE: + gen_helper_msa_ilvl_b(cpu_env, twd, tws, twt); + break; + case DF_HALF: + gen_ilvl_h(env, wd, ws, wt); + break; + case DF_WORD: + gen_ilvl_w(env, wd, ws, wt); + break; + case DF_DOUBLE: + gen_ilvl_d(env, wd, ws, wt); + break; + default: + assert(0); + } break; case OPC_BNEG_df: gen_helper_msa_bneg_df(cpu_env, tdf, twd, tws, twt); From patchwork Wed Apr 17 15:33:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Mateja Marjanovic X-Patchwork-Id: 10905567 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 03171922 for ; Wed, 17 Apr 2019 15:36:08 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DBE80287AA for ; Wed, 17 Apr 2019 15:36:07 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CDA3228998; Wed, 17 Apr 2019 15:36:07 +0000 (UTC) 
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 2BA82287AA for ; Wed, 17 Apr 2019 15:36:07 +0000 (UTC) Received: from localhost ([127.0.0.1]:55462 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hGmbW-0002AY-BH for patchwork-qemu-devel@patchwork.kernel.org; Wed, 17 Apr 2019 11:36:06 -0400 Received: from eggs.gnu.org ([209.51.188.92]:34521) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hGmZB-0000L7-8Z for qemu-devel@nongnu.org; Wed, 17 Apr 2019 11:33:42 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hGmZ8-0004l0-9X for qemu-devel@nongnu.org; Wed, 17 Apr 2019 11:33:40 -0400 Received: from mx2.rt-rk.com ([89.216.37.149]:34571 helo=mail.rt-rk.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hGmZ7-0004jV-E4 for qemu-devel@nongnu.org; Wed, 17 Apr 2019 11:33:37 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.rt-rk.com (Postfix) with ESMTP id 89C791A4132; Wed, 17 Apr 2019 17:33:34 +0200 (CEST) X-Virus-Scanned: amavisd-new at rt-rk.com Received: from rtrkw310-lin.domain.local (rtrkw310-lin.domain.local [10.10.13.97]) by mail.rt-rk.com (Postfix) with ESMTPSA id 652BD1A39C9; Wed, 17 Apr 2019 17:33:34 +0200 (CEST) From: Mateja Marjanovic To: qemu-devel@nongnu.org Date: Wed, 17 Apr 2019 17:33:24 +0200 Message-Id: <1555515206-9352-5-git-send-email-mateja.marjanovic@rt-rk.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1555515206-9352-1-git-send-email-mateja.marjanovic@rt-rk.com> References: 
 <1555515206-9352-1-git-send-email-mateja.marjanovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v7 4/6] target/mips: Optimize ILVR.<df> MSA instructions
Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com, amarkovic@wavecomp.com, aurelien@aurel32.net

From: Mateja Marjanovic

Optimize the ILVR.<df> instructions using a hybrid approach. For byte
data elements, use a helper with an unrolled loop, which measured faster
than direct TCG translation. For halfword, word and doubleword data
elements, operate directly on TCG registers.

Performance was measured by executing each instruction 10 million times
on a machine with an Intel Core i7-3770 CPU @ 3.40GHz×8.

============================================================
||instruction||  helper  ||   tcg    ||      hybrid       ||
============================================================
||  ilvr.b   || 62.87 ms || 74.76 ms || 61.49 ms (helper) ||
||  ilvr.h   || 44.11 ms || 33.00 ms || 33.69 ms (tcg)    ||
||  ilvr.w   || 34.97 ms || 23.06 ms || 23.01 ms (tcg)    ||
||  ilvr.d   || 27.33 ms || 19.87 ms || 19.65 ms (tcg)    ||
============================================================

Suggested-by: Aleksandar Markovic
Signed-off-by: Mateja Marjanovic
---
 target/mips/helper.h     |   2 +-
 target/mips/msa_helper.c |  33 +++++++++++----
 target/mips/translate.c  | 107 ++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 132 insertions(+), 10 deletions(-)

diff --git a/target/mips/helper.h b/target/mips/helper.h
index f737a0e..4aee81c 100644
--- a/target/mips/helper.h
+++ b/target/mips/helper.h
@@ -862,7 +862,6 @@ DEF_HELPER_5(msa_sld_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_splat_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckev_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckod_df, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(msa_ilvr_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_vshf_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srar_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srlr_df, void, env, i32, i32, i32, i32)
@@ -933,6 +932,7 @@ DEF_HELPER_4(msa_pcnt_df, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_nloc_df, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_nlzc_df, void, env, i32, i32, i32)
 
+DEF_HELPER_4(msa_ilvr_b, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_ilvl_b, void, env, i32, i32, i32)
 
 DEF_HELPER_4(msa_copy_s_b, void, env, i32, i32, i32)
diff --git a/target/mips/msa_helper.c b/target/mips/msa_helper.c
index 20ce447..fedd9b5 100644
--- a/target/mips/msa_helper.c
+++ b/target/mips/msa_helper.c
@@ -1181,14 +1181,6 @@ MSA_FN_DF(pckev_df)
     } while (0)
 MSA_FN_DF(pckod_df)
 #undef MSA_DO
-
-#define MSA_DO(DF) \
-    do { \
-        pwx->DF[2*i] = R##DF(pwt, i); \
-
pwx->DF[2*i+1] = R##DF(pws, i); \
-    } while (0)
-MSA_FN_DF(ilvr_df)
-#undef MSA_DO
 #undef MSA_LOOP_COND
 
 #define MSA_LOOP_COND(DF) \
@@ -1206,6 +1198,31 @@ MSA_FN_DF(vshf_df)
 #undef MSA_LOOP_COND
 #undef MSA_FN_DF
 
+void helper_msa_ilvr_b(CPUMIPSState *env, uint32_t wd,
+                       uint32_t ws, uint32_t wt)
+{
+    wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
+    wr_t *pws = &(env->active_fpu.fpr[ws].wr);
+    wr_t *pwt = &(env->active_fpu.fpr[wt].wr);
+
+    pwd->b[15] = pws->b[7];
+    pwd->b[14] = pwt->b[7];
+    pwd->b[13] = pws->b[6];
+    pwd->b[12] = pwt->b[6];
+    pwd->b[11] = pws->b[5];
+    pwd->b[10] = pwt->b[5];
+    pwd->b[9] = pws->b[4];
+    pwd->b[8] = pwt->b[4];
+    pwd->b[7] = pws->b[3];
+    pwd->b[6] = pwt->b[3];
+    pwd->b[5] = pws->b[2];
+    pwd->b[4] = pwt->b[2];
+    pwd->b[3] = pws->b[1];
+    pwd->b[2] = pwt->b[1];
+    pwd->b[1] = pws->b[0];
+    pwd->b[0] = pwt->b[0];
+}
+
 void helper_msa_ilvl_b(CPUMIPSState *env, uint32_t wd,
                        uint32_t ws, uint32_t wt)
 {
diff --git a/target/mips/translate.c b/target/mips/translate.c
index ee60f29..4c7b076 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -28885,6 +28885,96 @@ static void gen_msa_bit(CPUMIPSState *env, DisasContext *ctx)
 }
 
 /*
+ * [MSA] ILVR.H wd, ws, wt
+ *
+ * Vector Interleave Right (halfword data elements)
+ * The high doubleword of wd is written first, so wd may alias ws or wt.
+ */
+static inline void gen_ilvr_h(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    uint64_t mask = 0x0000ffff00000000ULL;
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 32);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0xffff000000000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t2, t1);
+
+    mask = 0x000000000000ffffULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0x00000000ffff0000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 32);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t2, t1);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+/*
+ * [MSA] ILVR.W wd, ws, wt
+ *
+ * Vector Interleave Right (word data elements)
+ * The high doubleword of wd is written first, so wd may alias ws or wt.
+ */
+static inline void gen_ilvr_w(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    uint64_t mask = 0xffffffff00000000ULL;
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 32);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t2, t1);
+
+    mask = 0x00000000ffffffffULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 32);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t2, t1);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+/*
+ * [MSA] ILVR.D wd, ws, wt
+ *
+ * Vector Interleave Right (doubleword data elements)
+ *
+ */
+static inline void gen_ilvr_d(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2]);
+    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2]);
+}
+
+
+/*
  * [MSA] ILVL.H wd, ws, wt
  *
  * Vector Interleave Left (halfword data elements)
@@ -29273,7 +29363,22 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
         gen_helper_msa_div_u_df(cpu_env, tdf, twd, tws, twt);
         break;
     case OPC_ILVR_df:
-        gen_helper_msa_ilvr_df(cpu_env, tdf, twd, tws, twt);
+        switch (df) {
+        case DF_BYTE:
+            gen_helper_msa_ilvr_b(cpu_env, twd, tws, twt);
+            break;
+        case DF_HALF:
+
gen_ilvr_h(env, wd, ws, wt);
+            break;
+        case DF_WORD:
+            gen_ilvr_w(env, wd, ws, wt);
+            break;
+        case DF_DOUBLE:
+            gen_ilvr_d(env, wd, ws, wt);
+            break;
+        default:
+            assert(0);
+        }
         break;
     case OPC_BINSL_df:
         gen_helper_msa_binsl_df(cpu_env, tdf, twd, tws, twt);

From patchwork Wed Apr 17 15:33:26 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mateja Marjanovic
X-Patchwork-Id: 10905577
From: Mateja Marjanovic
To: qemu-devel@nongnu.org
Date: Wed, 17 Apr 2019 17:33:26 +0200
Message-Id: <1555515206-9352-7-git-send-email-mateja.marjanovic@rt-rk.com>
In-Reply-To: <1555515206-9352-1-git-send-email-mateja.marjanovic@rt-rk.com>
References: <1555515206-9352-1-git-send-email-mateja.marjanovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v7 6/6] target/mips: Merge implementation of ILVOD.D and ILVL.D
Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com, amarkovic@wavecomp.com, aurelien@aurel32.net

From: Mateja Marjanovic

The implementations of the ILVOD.D and ILVL.D instructions are
equivalent, so use a single handler for both.
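
The equivalence holds only for doubleword elements. As a rough
illustration, here is a minimal reference model in plain C (not QEMU
code). The names model_ilvl, model_ilvod, ilvl_equals_ilvod and
MAX_ELEMS are hypothetical, and the element indexing is an assumption
generalized from the doubleword handlers in this series. With two
elements per vector, the left half and the odd-indexed elements are
both just element 1, so the results match; with four elements they
differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a vector is an array of n elements, n <= MAX_ELEMS. */
enum { MAX_ELEMS = 16 };

/* ILVL: interleave the left (upper) halves of wt and ws into wd. */
static void model_ilvl(uint64_t *wd, const uint64_t *ws,
                       const uint64_t *wt, int n)
{
    uint64_t tmp[MAX_ELEMS];    /* temporary, so wd may alias ws or wt */

    for (int i = 0; i < n / 2; i++) {
        tmp[2 * i]     = wt[n / 2 + i];
        tmp[2 * i + 1] = ws[n / 2 + i];
    }
    for (int i = 0; i < n; i++) {
        wd[i] = tmp[i];
    }
}

/* ILVOD: interleave the odd-indexed elements of wt and ws into wd. */
static void model_ilvod(uint64_t *wd, const uint64_t *ws,
                        const uint64_t *wt, int n)
{
    uint64_t tmp[MAX_ELEMS];    /* temporary, so wd may alias ws or wt */

    for (int i = 0; i < n / 2; i++) {
        tmp[2 * i]     = wt[2 * i + 1];
        tmp[2 * i + 1] = ws[2 * i + 1];
    }
    for (int i = 0; i < n; i++) {
        wd[i] = tmp[i];
    }
}

/* Returns 1 if both models give the same result for n elements, else 0. */
static int ilvl_equals_ilvod(const uint64_t *ws, const uint64_t *wt, int n)
{
    uint64_t a[MAX_ELEMS], b[MAX_ELEMS];

    model_ilvl(a, ws, wt, n);
    model_ilvod(b, ws, wt, n);
    for (int i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            return 0;
        }
    }
    return 1;
}
```

Calling ilvl_equals_ilvod() with n = 2 models the .D case that this
patch merges; n = 4 models the .W case, which keeps separate handlers.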

Suggested-by: Aleksandar Markovic
Signed-off-by: Mateja Marjanovic
---
 target/mips/translate.c | 27 ++++++++++-----------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/target/mips/translate.c b/target/mips/translate.c
index 656153a..14aaf98 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -29037,19 +29037,6 @@ static inline void gen_ilvl_w(CPUMIPSState *env, uint32_t wd,
 }
 
 /*
- * [MSA] ILVL.D wd, ws, wt
- *
- * Vector Interleave Left (doubleword data elements)
- *
- */
-static inline void gen_ilvl_d(CPUMIPSState *env, uint32_t wd,
-                              uint32_t ws, uint32_t wt)
-{
-    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2 + 1]);
-    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]);
-}
-
-/*
  * [MSA] ILVOD. wd, ws, wt
  *
  * Vector Interleave Odd ( data elements)
@@ -29115,9 +29102,15 @@ static inline void gen_ilvod_w(CPUMIPSState *env, uint32_t wd,
  *
  * Vector Interleave Odd (doubleword data elements)
  *
+ * [MSA] ILVL.D wd, ws, wt
+ *
+ * Vector Interleave Left (doubleword data elements)
+ *
+ * These two instructions are functionally equivalent.
+ *
  */
-static inline void gen_ilvod_d(CPUMIPSState *env, uint32_t wd,
-                               uint32_t ws, uint32_t wt)
+static inline void gen_ilvod_ilvl_d(CPUMIPSState *env, uint32_t wd,
+                                    uint32_t ws, uint32_t wt)
 {
     tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2 + 1]);
     tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]);
@@ -29330,7 +29323,7 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
             gen_ilvl_w(env, wd, ws, wt);
             break;
         case DF_DOUBLE:
-            gen_ilvl_d(env, wd, ws, wt);
+            gen_ilvod_ilvl_d(env, wd, ws, wt);
             break;
         default:
             assert(0);
@@ -29426,7 +29419,7 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
             gen_ilvod_w(env, wd, ws, wt);
             break;
         case DF_DOUBLE:
-            gen_ilvod_d(env, wd, ws, wt);
+            gen_ilvod_ilvl_d(env, wd, ws, wt);
             break;
         default:
             assert(0);