From patchwork Sun May 19 04:15:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 10949273 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8C27B912 for ; Sun, 19 May 2019 04:36:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7773C2856F for ; Sun, 19 May 2019 04:36:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 694FE2858E; Sun, 19 May 2019 04:36:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.0 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id B20AD2856F for ; Sun, 19 May 2019 04:36:47 +0000 (UTC) Received: from localhost ([127.0.0.1]:43755 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDZ1-0002lp-3j for patchwork-qemu-devel@patchwork.kernel.org; Sun, 19 May 2019 00:36:47 -0400 Received: from eggs.gnu.org ([209.51.188.92]:44218) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDSo-0005d1-TQ for qemu-devel@nongnu.org; Sun, 19 May 2019 00:30:26 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hSDEP-0004lA-1y for qemu-devel@nongnu.org; Sun, 19 May 2019 00:15:32 -0400 Received: from mail-pg1-x542.google.com ([2607:f8b0:4864:20::542]:34025) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hSDEO-0004kA-6c for qemu-devel@nongnu.org; Sun, 19 May 2019 00:15:28 -0400 Received: by mail-pg1-x542.google.com with SMTP id c13so5162617pgt.1 for ; Sat, 18 May 2019 21:15:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ivzXz7WaxGAoU4CgTEArJPFa5XQwFPUyGBuwDQJ+uGA=; b=Eigezh4Xyk+wzecWdqALTm8sZPUZkU5XcKRyrE/rUDSXlBkIWcJgs71alG+n6V3nbb KzxPJvCmvomAd0+dPtXE9nvTAUlbfGD31HFIJ2aPqvHsLwCml9OGOu3N7Bfj46zev+mp JrJAGwUZ960kXhQC2Od1dXuvxHWnNDUnJVT0MjLzWj6sVGKCMjjaFI5ts91QAkoXKv0Y Bs9rAeEJNY729hOA4uxHmdMBRyVqFSg1dHFDsDkT4HSycGed5R//o1LSptoAyOgh1k/i LKTXTL2rXkYEY/14P7U4XfwKkC41jO2wCRQfLLwVZpgrbWz0kIOO7Myp4ljDjb0z7msr anEw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ivzXz7WaxGAoU4CgTEArJPFa5XQwFPUyGBuwDQJ+uGA=; b=D+IHs8xM/thcJduVQ87uP6S7nWBnKb2TpETS6uADmOMwTMZiSnVr09QLpiphqFWQIg ugbDGX81O2F9vmF5vyujhxjpwUTMb1b3aReFwOC/7MBipI3WmwBiuSuPb9sMEU9p8CVM mU/5tZrKu31+qMsMJhj79vMEonEGuY6OJ5kO9UwPUAEzI3T1v0els5etqJiYL0ZoCxiA z3rQD2eDYgaJD3y2G/D21/ySHRxvAB08RnkGTxnHtQVfOXpkseCoF5B7Qw9PrgRkAwAM GMUZyHVgn9Cx1NHWym8lAiYeOUcbITMywI0mqhjkw7Pp4/3IREmlCzYYBuOMX8IA2kZk 5YyA== X-Gm-Message-State: APjAAAWAk0iRj906/Xt5qr12I4TGxBXzJKetMsl7o6/rGexDvocNtvyY xHsOkjsUftTxXfLOgFQmR1+0eWCpZ1s= X-Google-Smtp-Source: APXvYqy+3kmX8DlTm4UbNwYNbOPieUkaPkfa1Np/Ril6djGTn+dN9IQk0mEuRg43nesPiBvUnC50hA== X-Received: by 2002:aa7:8dc3:: with SMTP id j3mr72397650pfr.141.1558239326259; Sat, 18 May 2019 21:15:26 -0700 (PDT) Received: from localhost.localdomain (97-113-13-231.tukw.qwest.net. [97.113.13.231]) by smtp.gmail.com with ESMTPSA id k63sm19381200pfb.108.2019.05.18.21.15.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 18 May 2019 21:15:25 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 18 May 2019 21:15:16 -0700 Message-Id: <20190519041522.12327-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190519041522.12327-1-richard.henderson@linaro.org> References: <20190519041522.12327-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::542 Subject: [Qemu-devel] [PATCH v4 1/7] tcg/ppc: Initial backend support for Altivec X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mark.cave-ayland@ilande.co.uk Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP There are a few missing operations yet, like expansion of multiply and shifts. But this has move, load, store, and basic arithmetic. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.h | 36 +- tcg/ppc/tcg-target.opc.h | 3 + tcg/ppc/tcg-target.inc.c | 707 +++++++++++++++++++++++++++++++++++---- 3 files changed, 685 insertions(+), 61 deletions(-) create mode 100644 tcg/ppc/tcg-target.opc.h diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h index 7627fb62d3..368c250c6a 100644 --- a/tcg/ppc/tcg-target.h +++ b/tcg/ppc/tcg-target.h @@ -31,7 +31,7 @@ # define TCG_TARGET_REG_BITS 32 #endif -#define TCG_TARGET_NB_REGS 32 +#define TCG_TARGET_NB_REGS 64 #define TCG_TARGET_INSN_UNIT_SIZE 4 #define TCG_TARGET_TLB_DISPLACEMENT_BITS 16 @@ -45,10 +45,20 @@ typedef enum { TCG_REG_R24, TCG_REG_R25, TCG_REG_R26, TCG_REG_R27, TCG_REG_R28, TCG_REG_R29, TCG_REG_R30, TCG_REG_R31, + TCG_REG_V0, TCG_REG_V1, TCG_REG_V2, TCG_REG_V3, + TCG_REG_V4, TCG_REG_V5, TCG_REG_V6, TCG_REG_V7, + TCG_REG_V8, TCG_REG_V9, TCG_REG_V10, TCG_REG_V11, + TCG_REG_V12, TCG_REG_V13, TCG_REG_V14, TCG_REG_V15, + TCG_REG_V16, TCG_REG_V17, TCG_REG_V18, TCG_REG_V19, + TCG_REG_V20, TCG_REG_V21, TCG_REG_V22, TCG_REG_V23, + TCG_REG_V24, TCG_REG_V25, TCG_REG_V26, TCG_REG_V27, + TCG_REG_V28, TCG_REG_V29, TCG_REG_V30, TCG_REG_V31, + TCG_REG_CALL_STACK = TCG_REG_R1, TCG_AREG0 = TCG_REG_R27 } TCGReg; +extern bool have_isa_altivec; extern bool have_isa_2_06; extern bool have_isa_3_00; @@ -126,6 +136,30 @@ extern bool have_isa_3_00; #define TCG_TARGET_HAS_mulsh_i64 1 #endif +/* + * While technically Altivec could support V64, it has no 64-bit store + * instruction and substituting two 32-bit stores makes the generated + * code quite large. + */ +#define TCG_TARGET_HAS_v64 0 +#define TCG_TARGET_HAS_v128 have_isa_altivec +#define TCG_TARGET_HAS_v256 0 + +#define TCG_TARGET_HAS_andc_vec 1 +#define TCG_TARGET_HAS_orc_vec 0 +#define TCG_TARGET_HAS_not_vec 1 +#define TCG_TARGET_HAS_neg_vec 0 +#define TCG_TARGET_HAS_abs_vec 0 +#define TCG_TARGET_HAS_shi_vec 0 +#define TCG_TARGET_HAS_shs_vec 0 +#define TCG_TARGET_HAS_shv_vec 0 +#define TCG_TARGET_HAS_cmp_vec 1 +#define TCG_TARGET_HAS_mul_vec 0 +#define TCG_TARGET_HAS_sat_vec 1 +#define TCG_TARGET_HAS_minmax_vec 1 +#define TCG_TARGET_HAS_bitsel_vec 0 +#define TCG_TARGET_HAS_cmpsel_vec 0 + void flush_icache_range(uintptr_t start, uintptr_t stop); void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t); diff --git a/tcg/ppc/tcg-target.opc.h b/tcg/ppc/tcg-target.opc.h new file mode 100644 index 0000000000..4816a6c3d4 --- /dev/null +++ b/tcg/ppc/tcg-target.opc.h @@ -0,0 +1,3 @@ +/* Target-specific opcodes for host vector expansion. These will be + emitted by tcg_expand_vec_op. For those familiar with GCC internals, + consider these to be UNSPEC with names. */ diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c index 30c095d3d5..479e653da6 100644 --- a/tcg/ppc/tcg-target.inc.c +++ b/tcg/ppc/tcg-target.inc.c @@ -42,6 +42,9 @@ # define TCG_REG_TMP1 TCG_REG_R12 #endif +#define TCG_VEC_TMP1 TCG_REG_V0 +#define TCG_VEC_TMP2 TCG_REG_V1 + #define TCG_REG_TB TCG_REG_R31 #define USE_REG_TB (TCG_TARGET_REG_BITS == 64) @@ -61,6 +64,7 @@ static tcg_insn_unit *tb_ret_addr; +bool have_isa_altivec; bool have_isa_2_06; bool have_isa_3_00; @@ -72,39 +76,15 @@ bool have_isa_3_00; #endif #ifdef CONFIG_DEBUG_TCG -static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = { - "r0", - "r1", - "r2", - "r3", - "r4", - "r5", - "r6", - "r7", - "r8", - "r9", - "r10", - "r11", - "r12", - "r13", - "r14", - "r15", - "r16", - "r17", - "r18", - "r19", - "r20", - "r21", - "r22", - "r23", - "r24", - "r25", - "r26", - "r27", - "r28", - "r29", - "r30", - "r31" +static const char tcg_target_reg_names[TCG_TARGET_NB_REGS][4] = { + "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", + "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15", + "r16", "r17", "r18", "r19", "r20", "r21", "r22", "r23", + "r24", "r25", "r26", "r27", "r28", "r29", "r30", "r31", + "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", + "v8", "v9", "v10", "v11", "v12", "v13", "v14", "v15", + "v16", "v17", "v18", "v19", "v20", "v21", "v22", "v23", + "v24", "v25", "v26", "v27", "v28", "v29", "v30", "v31", }; #endif @@ -139,6 +119,26 @@ static const int tcg_target_reg_alloc_order[] = { TCG_REG_R5, TCG_REG_R4, TCG_REG_R3, + + /* V0 and V1 reserved as temporaries; V20 - V31 are call-saved */ + TCG_REG_V2, /* call clobbered, vectors */ + TCG_REG_V3, + TCG_REG_V4, + TCG_REG_V5, + TCG_REG_V6, + TCG_REG_V7, + TCG_REG_V8, + TCG_REG_V9, + TCG_REG_V10, + TCG_REG_V11, + TCG_REG_V12, + TCG_REG_V13, + TCG_REG_V14, + TCG_REG_V15, + TCG_REG_V16, + TCG_REG_V17, + TCG_REG_V18, + TCG_REG_V19, }; static const int tcg_target_call_iarg_regs[] = { @@ -233,6 +233,10 @@ static const char *target_parse_constraint(TCGArgConstraint *ct, ct->ct |= TCG_CT_REG; ct->u.regs = 0xffffffff; break; + case 'v': + ct->ct |= TCG_CT_REG; + ct->u.regs = 0xffffffff00000000ull; + break; case 'L': /* qemu_ld constraint */ ct->ct |= TCG_CT_REG; ct->u.regs = 0xffffffff; @@ -320,6 +324,7 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define XO31(opc) (OPCD(31)|((opc)<<1)) #define XO58(opc) (OPCD(58)|(opc)) #define XO62(opc) (OPCD(62)|(opc)) +#define VX4(opc) (OPCD(4)|(opc)) #define B OPCD( 18) #define BC OPCD( 16) @@ -461,6 +466,72 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define NOP ORI /* ori 0,0,0 */ +#define LVX XO31(103) +#define LVEBX XO31(7) +#define LVEHX XO31(39) +#define LVEWX XO31(71) + +#define STVX XO31(231) +#define STVEWX XO31(199) + +#define VADDSBS VX4(768) +#define VADDUBS VX4(512) +#define VADDUBM VX4(0) +#define VADDSHS VX4(832) +#define VADDUHS VX4(576) +#define VADDUHM VX4(64) +#define VADDSWS VX4(896) +#define VADDUWS VX4(640) +#define VADDUWM VX4(128) + +#define VSUBSBS VX4(1792) +#define VSUBUBS VX4(1536) +#define VSUBUBM VX4(1024) +#define VSUBSHS VX4(1856) +#define VSUBUHS VX4(1600) +#define VSUBUHM VX4(1088) +#define VSUBSWS VX4(1920) +#define VSUBUWS VX4(1664) +#define VSUBUWM VX4(1152) + +#define VMAXSB VX4(258) +#define VMAXSH VX4(322) +#define VMAXSW VX4(386) +#define VMAXUB VX4(2) +#define VMAXUH VX4(66) +#define VMAXUW VX4(130) +#define VMINSB VX4(770) +#define VMINSH VX4(834) +#define VMINSW VX4(898) +#define VMINUB VX4(514) +#define VMINUH VX4(578) +#define VMINUW VX4(642) + +#define VCMPEQUB VX4(6) +#define VCMPEQUH VX4(70) +#define VCMPEQUW VX4(134) +#define VCMPGTSB VX4(774) +#define VCMPGTSH VX4(838) +#define VCMPGTSW VX4(902) +#define VCMPGTUB VX4(518) +#define VCMPGTUH VX4(582) +#define VCMPGTUW VX4(646) + +#define VAND VX4(1028) +#define VANDC VX4(1092) +#define VNOR VX4(1284) +#define VOR VX4(1156) +#define VXOR VX4(1220) + +#define VSPLTB VX4(524) +#define VSPLTH VX4(588) +#define VSPLTW VX4(652) +#define VSPLTISB VX4(780) +#define VSPLTISH VX4(844) +#define VSPLTISW VX4(908) + +#define VSLDOI VX4(44) + #define RT(r) ((r)<<21) #define RS(r) ((r)<<21) #define RA(r) ((r)<<16) @@ -473,6 +544,11 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define MB64(b) ((b)<<5) #define FXM(b) (1 << (19 - (b))) +#define VRT(r) (((r) & 31) << 21) +#define VRA(r) (((r) & 31) << 16) +#define VRB(r) (((r) & 31) << 11) +#define VRC(r) (((r) & 31) << 6) + #define LK 1 #define TAB(t, a, b) (RT(t) | RA(a) | RB(b)) @@ -529,6 +605,8 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type, intptr_t value, intptr_t addend) { tcg_insn_unit *target; + int16_t lo; + int32_t hi; value += addend; target = (tcg_insn_unit *)value; @@ -550,6 +628,20 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type, } *code_ptr = (*code_ptr & ~0xfffc) | (value & 0xfffc); break; + case R_PPC_ADDR32: + /* + * We are abusing this relocation type. Again, this points to + * a pair of insns, lis + load. This is an absolute address + * relocation for PPC32 so the lis cannot be removed. + */ + lo = value; + hi = value - lo; + if (hi + lo != value) { + return false; + } + code_ptr[0] = deposit32(code_ptr[0], 0, 16, hi >> 16); + code_ptr[1] = deposit32(code_ptr[1], 0, 16, lo); + break; default: g_assert_not_reached(); } @@ -561,9 +653,29 @@ static void tcg_out_mem_long(TCGContext *s, int opi, int opx, TCGReg rt, static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) { - tcg_debug_assert(TCG_TARGET_REG_BITS == 64 || type == TCG_TYPE_I32); - if (ret != arg) { - tcg_out32(s, OR | SAB(arg, ret, arg)); + if (ret == arg) { + return true; + } + switch (type) { + case TCG_TYPE_I64: + tcg_debug_assert(TCG_TARGET_REG_BITS == 64); + /* fallthru */ + case TCG_TYPE_I32: + if (ret < 32 && arg < 32) { + tcg_out32(s, OR | SAB(arg, ret, arg)); + break; + } else if (ret < 32 || arg < 32) { + /* Altivec does not support vector/integer moves. */ + return false; + } + /* fallthru */ + case TCG_TYPE_V64: + case TCG_TYPE_V128: + tcg_debug_assert(ret >= 32 && arg >= 32); + tcg_out32(s, VOR | VRT(ret) | VRA(arg) | VRB(arg)); + break; + default: + g_assert_not_reached(); } return true; } @@ -712,10 +824,76 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, TCGReg ret, } } -static inline void tcg_out_movi(TCGContext *s, TCGType type, TCGReg ret, - tcg_target_long arg) +static void tcg_out_dupi_vec(TCGContext *s, TCGType type, TCGReg ret, + tcg_target_long val) { - tcg_out_movi_int(s, type, ret, arg, false); + uint32_t load_insn; + int rel, low; + intptr_t add; + + low = (int8_t)val; + if (low >= -16 && low < 16) { + if (val == (tcg_target_long)dup_const(MO_8, low)) { + tcg_out32(s, VSPLTISB | VRT(ret) | ((val & 31) << 16)); + return; + } + if (val == (tcg_target_long)dup_const(MO_16, low)) { + tcg_out32(s, VSPLTISH | VRT(ret) | ((val & 31) << 16)); + return; + } + if (val == (tcg_target_long)dup_const(MO_32, low)) { + tcg_out32(s, VSPLTISW | VRT(ret) | ((val & 31) << 16)); + return; + } + } + + /* + * Otherwise we must load the value from the constant pool. + */ + if (USE_REG_TB) { + rel = R_PPC_ADDR16; + add = -(intptr_t)s->code_gen_ptr; + } else { + rel = R_PPC_ADDR32; + add = 0; + } + + load_insn = LVX | VRT(ret) | RB(TCG_REG_TMP1); + if (TCG_TARGET_REG_BITS == 64) { + new_pool_l2(s, rel, s->code_ptr, add, val, val); + } else { + new_pool_l4(s, rel, s->code_ptr, add, val, val, val, val); + } + + if (USE_REG_TB) { + tcg_out32(s, ADDI | TAI(TCG_REG_TMP1, 0, 0)); + load_insn |= RA(TCG_REG_TB); + } else { + tcg_out32(s, ADDIS | TAI(TCG_REG_TMP1, 0, 0)); + tcg_out32(s, ADDI | TAI(TCG_REG_TMP1, TCG_REG_TMP1, 0)); + } + tcg_out32(s, load_insn); +} + +static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg ret, + tcg_target_long arg) +{ + switch (type) { + case TCG_TYPE_I32: + case TCG_TYPE_I64: + tcg_debug_assert(ret < 32); + tcg_out_movi_int(s, type, ret, arg, false); + break; + + case TCG_TYPE_V64: + case TCG_TYPE_V128: + tcg_debug_assert(ret >= 32); + tcg_out_dupi_vec(s, type, ret, arg); + break; + + default: + g_assert_not_reached(); + } } static bool mask_operand(uint32_t c, int *mb, int *me) @@ -868,7 +1046,7 @@ static void tcg_out_mem_long(TCGContext *s, int opi, int opx, TCGReg rt, } /* For unaligned, or very large offsets, use the indexed form. */ - if (offset & align || offset != (int32_t)offset) { + if (offset & align || offset != (int32_t)offset || opi == 0) { if (rs == base) { rs = TCG_REG_R0; } @@ -899,32 +1077,96 @@ static void tcg_out_mem_long(TCGContext *s, int opi, int opx, TCGReg rt, } } -static inline void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret, - TCGReg arg1, intptr_t arg2) +static void tcg_out_vsldoi(TCGContext *s, TCGReg ret, + TCGReg va, TCGReg vb, int shb) { - int opi, opx; - - tcg_debug_assert(TCG_TARGET_REG_BITS == 64 || type == TCG_TYPE_I32); - if (type == TCG_TYPE_I32) { - opi = LWZ, opx = LWZX; - } else { - opi = LD, opx = LDX; - } - tcg_out_mem_long(s, opi, opx, ret, arg1, arg2); + tcg_out32(s, VSLDOI | VRT(ret) | VRA(va) | VRB(vb) | (shb << 6)); } -static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg, - TCGReg arg1, intptr_t arg2) +static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret, + TCGReg base, intptr_t offset) { - int opi, opx; + int shift; - tcg_debug_assert(TCG_TARGET_REG_BITS == 64 || type == TCG_TYPE_I32); - if (type == TCG_TYPE_I32) { - opi = STW, opx = STWX; - } else { - opi = STD, opx = STDX; + switch (type) { + case TCG_TYPE_I32: + if (ret < 32) { + tcg_out_mem_long(s, LWZ, LWZX, ret, base, offset); + break; + } + assert((offset & 3) == 0); + tcg_out_mem_long(s, 0, LVEWX, ret & 31, base, offset); + shift = (offset - 4) & 0xc; + if (shift) { + tcg_out_vsldoi(s, ret, ret, ret, shift); + } + break; + case TCG_TYPE_I64: + if (ret < 32) { + tcg_out_mem_long(s, LD, LDX, ret, base, offset); + break; + } + /* fallthru */ + case TCG_TYPE_V64: + tcg_debug_assert(ret >= 32); + assert((offset & 7) == 0); + tcg_out_mem_long(s, 0, LVX, ret & 31, base, offset & -16); + if (offset & 8) { + tcg_out_vsldoi(s, ret, ret, ret, 8); + } + break; + case TCG_TYPE_V128: + tcg_debug_assert(ret >= 32); + assert((offset & 15) == 0); + tcg_out_mem_long(s, 0, LVX, ret & 31, base, offset); + break; + default: + g_assert_not_reached(); + } +} + +static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg, + TCGReg base, intptr_t offset) +{ + int shift; + + switch (type) { + case TCG_TYPE_I32: + if (arg < 32) { + tcg_out_mem_long(s, STW, STWX, arg, base, offset); + break; + } + assert((offset & 3) == 0); + shift = (offset - 4) & 0xc; + if (shift) { + tcg_out_vsldoi(s, TCG_VEC_TMP1, arg, arg, shift); + arg = TCG_VEC_TMP1; + } + tcg_out_mem_long(s, 0, STVEWX, arg & 31, base, offset); + break; + case TCG_TYPE_I64: + if (arg < 32) { + tcg_out_mem_long(s, STD, STDX, arg, base, offset); + break; + } + /* fallthru */ + case TCG_TYPE_V64: + tcg_debug_assert(arg >= 32); + assert((offset & 7) == 0); + if (offset & 8) { + tcg_out_vsldoi(s, TCG_VEC_TMP1, arg, arg, 8); + arg = TCG_VEC_TMP1; + } + tcg_out_mem_long(s, 0, STVEWX, arg & 31, base, offset); + tcg_out_mem_long(s, 0, STVEWX, arg & 31, base, offset + 4); + break; + case TCG_TYPE_V128: + tcg_debug_assert(arg >= 32); + tcg_out_mem_long(s, 0, STVX, arg & 31, base, offset); + break; + default: + g_assert_not_reached(); } - tcg_out_mem_long(s, opi, opx, arg, arg1, arg2); } static inline bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val, @@ -2616,6 +2858,292 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, } } +int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) +{ + switch (opc) { + case INDEX_op_and_vec: + case INDEX_op_or_vec: + case INDEX_op_xor_vec: + case INDEX_op_andc_vec: + case INDEX_op_not_vec: + return 1; + case INDEX_op_add_vec: + case INDEX_op_sub_vec: + case INDEX_op_smax_vec: + case INDEX_op_smin_vec: + case INDEX_op_umax_vec: + case INDEX_op_umin_vec: + case INDEX_op_ssadd_vec: + case INDEX_op_sssub_vec: + case INDEX_op_usadd_vec: + case INDEX_op_ussub_vec: + return vece <= MO_32; + case INDEX_op_cmp_vec: + return vece <= MO_32 ? -1 : 0; + default: + return 0; + } +} + +static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, + TCGReg dst, TCGReg src) +{ + tcg_debug_assert(dst >= 32); + tcg_debug_assert(src >= 32); + + /* + * Recall we use (or emulate) VSX integer loads, so the integer is + * right justified within the left (zero-index) double-word. + */ + switch (vece) { + case MO_8: + tcg_out32(s, VSPLTB | VRT(dst) | VRB(src) | (7 << 16)); + break; + case MO_16: + tcg_out32(s, VSPLTH | VRT(dst) | VRB(src) | (3 << 16)); + break; + case MO_32: + tcg_out32(s, VSPLTW | VRT(dst) | VRB(src) | (1 << 16)); + break; + case MO_64: + tcg_out_vsldoi(s, TCG_VEC_TMP1, src, src, 8); + tcg_out_vsldoi(s, dst, TCG_VEC_TMP1, src, 8); + break; + default: + g_assert_not_reached(); + } + return true; +} + +static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, + TCGReg out, TCGReg base, intptr_t offset) +{ + int elt; + + tcg_debug_assert(out >= 32); + out &= 31; + switch (vece) { + case MO_8: + tcg_out_mem_long(s, 0, LVEBX, out, base, offset); + elt = extract32(offset, 0, 4); +#ifndef HOST_WORDS_BIGENDIAN + elt ^= 15; +#endif + tcg_out32(s, VSPLTB | VRT(out) | VRB(out) | (elt << 16)); + break; + case MO_16: + assert((offset & 1) == 0); + tcg_out_mem_long(s, 0, LVEHX, out, base, offset); + elt = extract32(offset, 1, 3); +#ifndef HOST_WORDS_BIGENDIAN + elt ^= 7; +#endif + tcg_out32(s, VSPLTH | VRT(out) | VRB(out) | (elt << 16)); + break; + case MO_32: + assert((offset & 3) == 0); + tcg_out_mem_long(s, 0, LVEWX, out, base, offset); + elt = extract32(offset, 2, 2); +#ifndef HOST_WORDS_BIGENDIAN + elt ^= 3; +#endif + tcg_out32(s, VSPLTW | VRT(out) | VRB(out) | (elt << 16)); + break; + case MO_64: + assert((offset & 7) == 0); + tcg_out_mem_long(s, 0, LVX, out, base, offset & -16); + tcg_out_vsldoi(s, TCG_VEC_TMP1, out, out, 8); + elt = extract32(offset, 3, 1); +#ifndef HOST_WORDS_BIGENDIAN + elt = !elt; +#endif + if (elt) { + tcg_out_vsldoi(s, out, out, TCG_VEC_TMP1, 8); + } else { + tcg_out_vsldoi(s, out, TCG_VEC_TMP1, out, 8); + } + break; + default: + g_assert_not_reached(); + } + return true; +} + +static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, + unsigned vecl, unsigned vece, + const TCGArg *args, const int *const_args) +{ + static const uint32_t + add_op[4] = { VADDUBM, VADDUHM, VADDUWM, 0 }, + sub_op[4] = { VSUBUBM, VSUBUHM, VSUBUWM, 0 }, + eq_op[4] = { VCMPEQUB, VCMPEQUH, VCMPEQUW, 0 }, + gts_op[4] = { VCMPGTSB, VCMPGTSH, VCMPGTSW, 0 }, + gtu_op[4] = { VCMPGTUB, VCMPGTUH, VCMPGTUW, 0 }, + ssadd_op[4] = { VADDSBS, VADDSHS, VADDSWS, 0 }, + usadd_op[4] = { VADDUBS, VADDUHS, VADDUWS, 0 }, + sssub_op[4] = { VSUBSBS, VSUBSHS, VSUBSWS, 0 }, + ussub_op[4] = { VSUBUBS, VSUBUHS, VSUBUWS, 0 }, + umin_op[4] = { VMINUB, VMINUH, VMINUW, 0 }, + smin_op[4] = { VMINSB, VMINSH, VMINSW, 0 }, + umax_op[4] = { VMAXUB, VMAXUH, VMAXUW, 0 }, + smax_op[4] = { VMAXSB, VMAXSH, VMAXSW, 0 }; + + TCGType type = vecl + TCG_TYPE_V64; + TCGArg a0 = args[0], a1 = args[1], a2 = args[2]; + uint32_t insn; + + switch (opc) { + case INDEX_op_ld_vec: + tcg_out_ld(s, type, a0, a1, a2); + return; + case INDEX_op_st_vec: + tcg_out_st(s, type, a0, a1, a2); + return; + case INDEX_op_dupm_vec: + tcg_out_dupm_vec(s, type, vece, a0, a1, a2); + return; + + case INDEX_op_add_vec: + insn = add_op[vece]; + break; + case INDEX_op_sub_vec: + insn = sub_op[vece]; + break; + case INDEX_op_ssadd_vec: + insn = ssadd_op[vece]; + break; + case INDEX_op_sssub_vec: + insn = sssub_op[vece]; + break; + case INDEX_op_usadd_vec: + insn = usadd_op[vece]; + break; + case INDEX_op_ussub_vec: + insn = ussub_op[vece]; + break; + case INDEX_op_smin_vec: + insn = smin_op[vece]; + break; + case INDEX_op_umin_vec: + insn = umin_op[vece]; + break; + case INDEX_op_smax_vec: + insn = smax_op[vece]; + break; + case INDEX_op_umax_vec: + insn = umax_op[vece]; + break; + case INDEX_op_and_vec: + insn = VAND; + break; + case INDEX_op_or_vec: + insn = VOR; + break; + case INDEX_op_xor_vec: + insn = VXOR; + break; + case INDEX_op_andc_vec: + insn = VANDC; + break; + case INDEX_op_not_vec: + insn = VNOR; + a2 = a1; + break; + + case INDEX_op_cmp_vec: + switch (args[3]) { + case TCG_COND_EQ: + insn = eq_op[vece]; + break; + case TCG_COND_GT: + insn = gts_op[vece]; + break; + case TCG_COND_GTU: + insn = gtu_op[vece]; + break; + default: + g_assert_not_reached(); + } + break; + + case INDEX_op_mov_vec: /* Always emitted via tcg_out_mov. */ + case INDEX_op_dupi_vec: /* Always emitted via tcg_out_movi. */ + case INDEX_op_dup_vec: /* Always emitted via tcg_out_dup_vec. */ + default: + g_assert_not_reached(); + } + + tcg_debug_assert(insn != 0); + tcg_out32(s, insn | VRT(a0) | VRA(a1) | VRB(a2)); +} + +static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, + TCGv_vec v1, TCGv_vec v2, TCGCond cond) +{ + bool need_swap = false, need_inv = false; + + tcg_debug_assert(vece <= MO_32); + + switch (cond) { + case TCG_COND_EQ: + case TCG_COND_GT: + case TCG_COND_GTU: + break; + case TCG_COND_NE: + case TCG_COND_LE: + case TCG_COND_LEU: + need_inv = true; + break; + case TCG_COND_LT: + case TCG_COND_LTU: + need_swap = true; + break; + case TCG_COND_GE: + case TCG_COND_GEU: + need_swap = need_inv = true; + break; + default: + g_assert_not_reached(); + } + + if (need_inv) { + cond = tcg_invert_cond(cond); + } + if (need_swap) { + TCGv_vec t1; + t1 = v1, v1 = v2, v2 = t1; + cond = tcg_swap_cond(cond); + } + + vec_gen_4(INDEX_op_cmp_vec, type, vece, tcgv_vec_arg(v0), + tcgv_vec_arg(v1), tcgv_vec_arg(v2), cond); + + if (need_inv) { + tcg_gen_not_vec(vece, v0, v0); + } +} + +void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, + TCGArg a0, ...) +{ + va_list va; + TCGv_vec v0, v1, v2; + + va_start(va, a0); + v0 = temp_tcgv_vec(arg_temp(a0)); + v1 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); + v2 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); + + switch (opc) { + case INDEX_op_cmp_vec: + expand_vec_cmp(type, vece, v0, v1, v2, va_arg(va, TCGArg)); + break; + default: + g_assert_not_reached(); + } + va_end(va); +} + static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) { static const TCGTargetOpDef r = { .args_ct_str = { "r" } }; @@ -2653,6 +3181,9 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) = { .args_ct_str = { "r", "r", "r", "r", "rI", "rZM" } }; static const TCGTargetOpDef sub2 = { .args_ct_str = { "r", "r", "rI", "rZM", "r", "r" } }; + static const TCGTargetOpDef v_r = { .args_ct_str = { "v", "r" } }; + static const TCGTargetOpDef v_v = { .args_ct_str = { "v", "v" } }; + static const TCGTargetOpDef v_v_v = { .args_ct_str = { "v", "v", "v" } }; switch (op) { case INDEX_op_goto_ptr: @@ -2788,6 +3319,32 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) return (TCG_TARGET_REG_BITS == 64 ? &S_S : TARGET_LONG_BITS == 32 ? &S_S_S : &S_S_S_S); + case INDEX_op_add_vec: + case INDEX_op_sub_vec: + case INDEX_op_mul_vec: + case INDEX_op_and_vec: + case INDEX_op_or_vec: + case INDEX_op_xor_vec: + case INDEX_op_andc_vec: + case INDEX_op_orc_vec: + case INDEX_op_cmp_vec: + case INDEX_op_ssadd_vec: + case INDEX_op_sssub_vec: + case INDEX_op_usadd_vec: + case INDEX_op_ussub_vec: + case INDEX_op_smax_vec: + case INDEX_op_smin_vec: + case INDEX_op_umax_vec: + case INDEX_op_umin_vec: + return &v_v_v; + case INDEX_op_not_vec: + case INDEX_op_dup_vec: + return &v_v; + case INDEX_op_ld_vec: + case INDEX_op_st_vec: + case INDEX_op_dupm_vec: + return &v_r; + default: return NULL; } @@ -2798,6 +3355,9 @@ static void tcg_target_init(TCGContext *s) unsigned long hwcap = qemu_getauxval(AT_HWCAP); unsigned long hwcap2 = qemu_getauxval(AT_HWCAP2); + if (hwcap & PPC_FEATURE_HAS_ALTIVEC) { + have_isa_altivec = true; + } if (hwcap & PPC_FEATURE_ARCH_2_06) { have_isa_2_06 = true; } @@ -2809,6 +3369,10 @@ static void tcg_target_init(TCGContext *s) tcg_target_available_regs[TCG_TYPE_I32] = 0xffffffff; tcg_target_available_regs[TCG_TYPE_I64] = 0xffffffff; + if (have_isa_altivec) { + tcg_target_available_regs[TCG_TYPE_V64] = 0xffffffff00000000ull; + tcg_target_available_regs[TCG_TYPE_V128] = 0xffffffff00000000ull; + } tcg_target_call_clobber_regs = 0; tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R0); @@ -2824,6 +3388,27 @@ static void tcg_target_init(TCGContext *s) tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R11); tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R12); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V0); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V1); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V2); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V3); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V4); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V5); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V6); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V7); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V8); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V9); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V10); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V11); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V12); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V13); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V14); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V15); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V16); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V17); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V18); + tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_V19); + s->reserved_regs = 0; tcg_regset_set_reg(s->reserved_regs, TCG_REG_R0); /* tcg temp */ tcg_regset_set_reg(s->reserved_regs, TCG_REG_R1); /* stack pointer */ @@ -2834,6 +3419,8 @@ static void tcg_target_init(TCGContext *s) tcg_regset_set_reg(s->reserved_regs, TCG_REG_R13); /* thread pointer */ #endif tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP1); /* mem temp */ + tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP1); + tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP2); if (USE_REG_TB) { tcg_regset_set_reg(s->reserved_regs, TCG_REG_TB); /* tb->tc_ptr */ } From patchwork Sun May 19 04:15:17 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 10949275 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A040776 for ; Sun, 19 May 2019 04:37:53 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8BADF2856F for ; Sun, 19 May 2019 04:37:53 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7C5ED2858E; Sun, 19 May 2019 04:37:53 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.0 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id E90C62856F for ; Sun, 19 May 2019 04:37:52 +0000 (UTC) Received: from localhost ([127.0.0.1]:43765 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDa4-0003Uu-8l for patchwork-qemu-devel@patchwork.kernel.org; Sun, 19 May 2019 00:37:52 -0400 Received: from eggs.gnu.org ([209.51.188.92]:44461) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDSr-0005xg-FV for qemu-devel@nongnu.org; Sun, 19 May 2019 00:30:27 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hSDEP-0004lI-87 for qemu-devel@nongnu.org; Sun, 19 May 2019 00:15:30 -0400 Received: from mail-pg1-x542.google.com ([2607:f8b0:4864:20::542]:35719) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hSDEP-0004kZ-2L for qemu-devel@nongnu.org; Sun, 19 May 2019 00:15:29 -0400 Received: by mail-pg1-x542.google.com with SMTP id t1so3735626pgc.2 for ; Sat, 18 May 2019 21:15:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=YfGVkAkAdchVmsbR1Bbtd6g0EjAyZZwmv0vuRe19lD0=; b=FGru2E7SYc23+VWoUthA8xDmW9nmauqdoaPvE9XDBBnGJW2+5oyUf3070ukwfKlsXb S9WzNoEsPzG1LNyfUWOd5nFRJ73GcSIKg9hETmdpodSNseE4CKCGsuRKLxxyfH2//zFg yUqZA5I+axRfaIwgVZYoZLksPXuIPx4iQA8oMKyDsPjG6BNANVpKNG1Ny0kM2xTZUTZ8 ciAPNHKCgiPdcHeCPRB7ey9ftoTXD9CumlhVqlx5O3WBPay7WUCBzQjd9Nsjiluihvux UXkIOKv/PVOAuvp7+ieAU/F/MSMAo+q3sKEd9EuK0Q0k3zU+VCliD1vCYt6KURKMOUKO AJOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=YfGVkAkAdchVmsbR1Bbtd6g0EjAyZZwmv0vuRe19lD0=; b=razi8M7kqWpzMwNT8UoR1fwUFNIx8DOhlotQEtgvO0xQTNfly6/ejTSrlqZBbU6+iQ cHIB4xw8WGQ5ueZ5YuXk9BZiOelHhpYgNl7ZMEluP1n/fOm/iPZ6zu0ZkJSFeReLcoxd iRTxkBSwjLYielE83C3wvzXImbp+OpCYhSUHfqCIkqUsYnwr7FnNtnQdzC71frJse/QD wNj+tX6Nbv2WmG4em+V7w3jRDOmqEagWonODMsboWLRzbNuJwsKLTJBXWcR8mAf3YSM0 iAKXMtugW7VhsH+MpnGoY9gls26WIRiSYWOHzyALJBTPwpznf4qFnMQfnQObUR2pBO37 S+PA== X-Gm-Message-State: APjAAAUNHB0xlaRLewpvYI/9DnS0DBW+6Kgd3CCtZIC1EPrMFqj+ukrS 6bjpTlqGazaVZwEAf09NreMI/arMc6s= X-Google-Smtp-Source: APXvYqx3mKLpnEU1tXm4of1yE0U8b/ehg1iYffQT+AkdN/WdS7DWNR+FG/+jrgJUcZ5mgAPKfh6IqA== X-Received: by 2002:a63:5c4c:: with SMTP id n12mr68194711pgm.111.1558239327376; Sat, 18 May 2019 21:15:27 -0700 (PDT) Received: from localhost.localdomain (97-113-13-231.tukw.qwest.net. [97.113.13.231]) by smtp.gmail.com with ESMTPSA id k63sm19381200pfb.108.2019.05.18.21.15.26 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 18 May 2019 21:15:26 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 18 May 2019 21:15:17 -0700 Message-Id: <20190519041522.12327-3-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190519041522.12327-1-richard.henderson@linaro.org> References: <20190519041522.12327-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::542 Subject: [Qemu-devel] [PATCH v4 2/7] tcg/ppc: Support vector shift by immediate X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mark.cave-ayland@ilande.co.uk Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP For Altivec, this is done via vector shift by vector, and loading the immediate into a register. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.h | 2 +- tcg/ppc/tcg-target.inc.c | 58 ++++++++++++++++++++++++++++++++++++++-- 2 files changed, 57 insertions(+), 3 deletions(-) diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h index 368c250c6a..766706fd30 100644 --- a/tcg/ppc/tcg-target.h +++ b/tcg/ppc/tcg-target.h @@ -152,7 +152,7 @@ extern bool have_isa_3_00; #define TCG_TARGET_HAS_abs_vec 0 #define TCG_TARGET_HAS_shi_vec 0 #define TCG_TARGET_HAS_shs_vec 0 -#define TCG_TARGET_HAS_shv_vec 0 +#define TCG_TARGET_HAS_shv_vec 1 #define TCG_TARGET_HAS_cmp_vec 1 #define TCG_TARGET_HAS_mul_vec 0 #define TCG_TARGET_HAS_sat_vec 1 diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c index 479e653da6..62a8c428e0 100644 --- a/tcg/ppc/tcg-target.inc.c +++ b/tcg/ppc/tcg-target.inc.c @@ -517,6 +517,16 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define VCMPGTUH VX4(582) #define VCMPGTUW VX4(646) +#define VSLB VX4(260) +#define VSLH VX4(324) +#define VSLW VX4(388) +#define VSRB VX4(516) +#define VSRH VX4(580) +#define VSRW VX4(644) +#define VSRAB VX4(772) +#define VSRAH VX4(836) +#define VSRAW VX4(900) + #define VAND VX4(1028) #define VANDC VX4(1092) #define VNOR VX4(1284) @@ -2877,8 +2887,14 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_sssub_vec: case INDEX_op_usadd_vec: case INDEX_op_ussub_vec: + case INDEX_op_shlv_vec: + case INDEX_op_shrv_vec: + case INDEX_op_sarv_vec: return vece <= MO_32; case INDEX_op_cmp_vec: + case INDEX_op_shli_vec: + case INDEX_op_shri_vec: + case INDEX_op_sari_vec: return vece <= MO_32 ? -1 : 0; default: return 0; @@ -2986,7 +3002,10 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, umin_op[4] = { VMINUB, VMINUH, VMINUW, 0 }, smin_op[4] = { VMINSB, VMINSH, VMINSW, 0 }, umax_op[4] = { VMAXUB, VMAXUH, VMAXUW, 0 }, - smax_op[4] = { VMAXSB, VMAXSH, VMAXSW, 0 }; + smax_op[4] = { VMAXSB, VMAXSH, VMAXSW, 0 }, + shlv_op[4] = { VSLB, VSLH, VSLW, 0 }, + shrv_op[4] = { VSRB, VSRH, VSRW, 0 }, + sarv_op[4] = { VSRAB, VSRAH, VSRAW, 0 }; TCGType type = vecl + TCG_TYPE_V64; TCGArg a0 = args[0], a1 = args[1], a2 = args[2]; @@ -3033,6 +3052,15 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_umax_vec: insn = umax_op[vece]; break; + case INDEX_op_shlv_vec: + insn = shlv_op[vece]; + break; + case INDEX_op_shrv_vec: + insn = shrv_op[vece]; + break; + case INDEX_op_sarv_vec: + insn = sarv_op[vece]; + break; case INDEX_op_and_vec: insn = VAND; break; @@ -3077,6 +3105,18 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, tcg_out32(s, insn | VRT(a0) | VRA(a1) | VRB(a2)); } +static void expand_vec_shi(TCGType type, unsigned vece, TCGv_vec v0, + TCGv_vec v1, TCGArg imm, TCGOpcode opci) +{ + TCGv_vec t1 = tcg_temp_new_vec(type); + + /* Splat w/bytes for xxspltib. */ + tcg_gen_dupi_vec(MO_8, t1, imm & ((8 << vece) - 1)); + vec_gen_3(opci, type, vece, tcgv_vec_arg(v0), + tcgv_vec_arg(v1), tcgv_vec_arg(t1)); + tcg_temp_free_vec(t1); +} + static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, TCGv_vec v1, TCGv_vec v2, TCGCond cond) { @@ -3128,14 +3168,25 @@ void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, { va_list va; TCGv_vec v0, v1, v2; + TCGArg a2; va_start(va, a0); v0 = temp_tcgv_vec(arg_temp(a0)); v1 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); - v2 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); + a2 = va_arg(va, TCGArg); switch (opc) { + case INDEX_op_shli_vec: + expand_vec_shi(type, vece, v0, v1, a2, INDEX_op_shlv_vec); + break; + case INDEX_op_shri_vec: + expand_vec_shi(type, vece, v0, v1, a2, INDEX_op_shrv_vec); + break; + case INDEX_op_sari_vec: + expand_vec_shi(type, vece, v0, v1, a2, INDEX_op_sarv_vec); + break; case INDEX_op_cmp_vec: + v2 = temp_tcgv_vec(arg_temp(a2)); expand_vec_cmp(type, vece, v0, v1, v2, va_arg(va, TCGArg)); break; default: @@ -3336,6 +3387,9 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_smin_vec: case INDEX_op_umax_vec: case INDEX_op_umin_vec: + case INDEX_op_shlv_vec: + case INDEX_op_shrv_vec: + case INDEX_op_sarv_vec: return &v_v_v; case INDEX_op_not_vec: case INDEX_op_dup_vec: From patchwork Sun May 19 04:15:18 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 10949271 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 181301398 for ; Sun, 19 May 2019 04:36:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 027DB2856F for ; Sun, 19 May 2019 04:36:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E50D52858E; Sun, 19 May 2019 04:35:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.0 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 3B6252856F for ; Sun, 19 May 2019 04:35:59 +0000 (UTC) Received: from localhost ([127.0.0.1]:43753 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDYC-0001wY-JR for patchwork-qemu-devel@patchwork.kernel.org; Sun, 19 May 2019 00:35:56 -0400 Received: from eggs.gnu.org ([209.51.188.92]:44461) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDSo-0005xg-Fa for qemu-devel@nongnu.org; Sun, 19 May 2019 00:30:24 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hSDEQ-0004mH-Vl for qemu-devel@nongnu.org; Sun, 19 May 2019 00:15:32 -0400 Received: from mail-pg1-x542.google.com ([2607:f8b0:4864:20::542]:43228) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hSDEQ-0004lP-I3 for qemu-devel@nongnu.org; Sun, 19 May 2019 00:15:30 -0400 Received: by mail-pg1-x542.google.com with SMTP id t22so5140092pgi.10 for ; Sat, 18 May 2019 21:15:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=fTrWQvU6AxYkKMinL21lSwX4DniTxfJCDaDFb9S03jI=; b=AgUxub1DrMQK8SaSS4EBSasKgXOW1Qq89FI8sMdzApCqgHK18V1QCfDvhZ73z3lF3k /eeiyT8kdrj+2gQ/Kt1JQ4D5n85cPgemrahs8pwdoiCiL1M3pbt4l5wU4PJdgeLk5sOV R0p3Mkcfk8Kae0f8ncgxZIPpN80lLLioASW1iuVqDc2eipG5Q9d2qTv1vh3x9Min2Fqy HXcQwnyr43bV89Ax6Nv1FKQnTGPpYSAjIkEBbZTmh7AwXYLrT/r508nObZeYJQ3lSRnz /JmuEHji9ovgLTwG4prbJ3nbT/BkthPOxYXSe3ip5Kqy5sXqLf5vzyhYAYO5SxazVZNv ZgXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=fTrWQvU6AxYkKMinL21lSwX4DniTxfJCDaDFb9S03jI=; b=QrNPWwQjyy/T5cFyrsiVGYTvOkHBgbtVL6shmZKvTbHqpaxhEnT3CtQCkqxsrMD1QS cykM+ssMRg/nSvb8ac4qI/dMtnMf4vz/Ql9IidCRVIXuDlPBUAAfzF6zwW2ygwOVQYbt epWCfbtKFGZdZaVveP80VyXxjlSyEIyy1ak2MZuUpzhsq51J8D8UVAO9mwteZiIQ6fPF KTvSnjL3rN51MqLSsowsmLL8k0ae+7+6fis1zYCiwwj3/6s7HpG71bZLuC5RRONajQsD CmQvIAsWSYYmF60qOmNNc0IunnIItxk1yWVJdiAVPESrQfyVMNfJeyV5UZbIFcY7xy7B 0vTQ== X-Gm-Message-State: APjAAAX35wCo2F2AEDfbCVRWiesn57923J9gv25HeoMDvW/OnQLtwnpJ 5Inr2m4+ev1l9THTezqMaMpVIU/WsNY= X-Google-Smtp-Source: APXvYqz9R+sYix21Ikf03cfxpmbbQMZW0QWQsxRQMvlvcRsi5lWStYLanWlOHF7jXl8Z3zroy5XMGQ== X-Received: by 2002:a65:478a:: with SMTP id e10mr67994729pgs.310.1558239328640; Sat, 18 May 2019 21:15:28 -0700 (PDT) Received: from localhost.localdomain (97-113-13-231.tukw.qwest.net. [97.113.13.231]) by smtp.gmail.com with ESMTPSA id k63sm19381200pfb.108.2019.05.18.21.15.27 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 18 May 2019 21:15:27 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 18 May 2019 21:15:18 -0700 Message-Id: <20190519041522.12327-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190519041522.12327-1-richard.henderson@linaro.org> References: <20190519041522.12327-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::542 Subject: [Qemu-devel] [PATCH v4 3/7] tcg/ppc: Support vector multiply X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mark.cave-ayland@ilande.co.uk Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP For Altivec, this is always an expansion. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.h | 2 +- tcg/ppc/tcg-target.opc.h | 8 +++ tcg/ppc/tcg-target.inc.c | 112 ++++++++++++++++++++++++++++++++++++++- 3 files changed, 120 insertions(+), 2 deletions(-) diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h index 766706fd30..a130192cbd 100644 --- a/tcg/ppc/tcg-target.h +++ b/tcg/ppc/tcg-target.h @@ -154,7 +154,7 @@ extern bool have_isa_3_00; #define TCG_TARGET_HAS_shs_vec 0 #define TCG_TARGET_HAS_shv_vec 1 #define TCG_TARGET_HAS_cmp_vec 1 -#define TCG_TARGET_HAS_mul_vec 0 +#define TCG_TARGET_HAS_mul_vec 1 #define TCG_TARGET_HAS_sat_vec 1 #define TCG_TARGET_HAS_minmax_vec 1 #define TCG_TARGET_HAS_bitsel_vec 0 diff --git a/tcg/ppc/tcg-target.opc.h b/tcg/ppc/tcg-target.opc.h index 4816a6c3d4..5c6a5ad52c 100644 --- a/tcg/ppc/tcg-target.opc.h +++ b/tcg/ppc/tcg-target.opc.h @@ -1,3 +1,11 @@ /* Target-specific opcodes for host vector expansion. These will be emitted by tcg_expand_vec_op. For those familiar with GCC internals, consider these to be UNSPEC with names. */ + +DEF(ppc_mrgh_vec, 1, 2, 0, IMPLVEC) +DEF(ppc_mrgl_vec, 1, 2, 0, IMPLVEC) +DEF(ppc_msum_vec, 1, 3, 0, IMPLVEC) +DEF(ppc_muleu_vec, 1, 2, 0, IMPLVEC) +DEF(ppc_mulou_vec, 1, 2, 0, IMPLVEC) +DEF(ppc_pkum_vec, 1, 2, 0, IMPLVEC) +DEF(ppc_rotl_vec, 1, 2, 0, IMPLVEC) diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c index 62a8c428e0..9d58db9eb1 100644 --- a/tcg/ppc/tcg-target.inc.c +++ b/tcg/ppc/tcg-target.inc.c @@ -526,6 +526,25 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define VSRAB VX4(772) #define VSRAH VX4(836) #define VSRAW VX4(900) +#define VRLB VX4(4) +#define VRLH VX4(68) +#define VRLW VX4(132) + +#define VMULEUB VX4(520) +#define VMULEUH VX4(584) +#define VMULOUB VX4(8) +#define VMULOUH VX4(72) +#define VMSUMUHM VX4(38) + +#define VMRGHB VX4(12) +#define VMRGHH VX4(76) +#define VMRGHW VX4(140) +#define VMRGLB VX4(268) +#define VMRGLH VX4(332) +#define VMRGLW VX4(396) + +#define VPKUHUM VX4(14) +#define VPKUWUM VX4(78) #define VAND VX4(1028) #define VANDC VX4(1092) @@ -2892,6 +2911,7 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_sarv_vec: return vece <= MO_32; case INDEX_op_cmp_vec: + case INDEX_op_mul_vec: case INDEX_op_shli_vec: case INDEX_op_shri_vec: case INDEX_op_sari_vec: @@ -3005,7 +3025,13 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, smax_op[4] = { VMAXSB, VMAXSH, VMAXSW, 0 }, shlv_op[4] = { VSLB, VSLH, VSLW, 0 }, shrv_op[4] = { VSRB, VSRH, VSRW, 0 }, - sarv_op[4] = { VSRAB, VSRAH, VSRAW, 0 }; + sarv_op[4] = { VSRAB, VSRAH, VSRAW, 0 }, + mrgh_op[4] = { VMRGHB, VMRGHH, VMRGHW, 0 }, + mrgl_op[4] = { VMRGLB, VMRGLH, VMRGLW, 0 }, + muleu_op[4] = { VMULEUB, VMULEUH, 0, 0 }, + mulou_op[4] = { VMULOUB, VMULOUH, 0, 0 }, + pkum_op[4] = { VPKUHUM, VPKUWUM, 0, 0 }, + rotl_op[4] = { VRLB, VRLH, VRLW, 0 }; TCGType type = vecl + TCG_TYPE_V64; TCGArg a0 = args[0], a1 = args[1], a2 = args[2]; @@ -3094,6 +3120,29 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, } break; + case INDEX_op_ppc_mrgh_vec: + insn = mrgh_op[vece]; + break; + case INDEX_op_ppc_mrgl_vec: + insn = mrgl_op[vece]; + break; + case INDEX_op_ppc_muleu_vec: + insn = muleu_op[vece]; + break; + case INDEX_op_ppc_mulou_vec: + insn = mulou_op[vece]; + break; + case INDEX_op_ppc_pkum_vec: + insn = pkum_op[vece]; + break; + case INDEX_op_ppc_rotl_vec: + insn = rotl_op[vece]; + break; + case INDEX_op_ppc_msum_vec: + tcg_debug_assert(vece == MO_16); + tcg_out32(s, VMSUMUHM | VRT(a0) | VRA(a1) | VRB(a2) | VRC(args[3])); + return; + case INDEX_op_mov_vec: /* Always emitted via tcg_out_mov. */ case INDEX_op_dupi_vec: /* Always emitted via tcg_out_movi. */ case INDEX_op_dup_vec: /* Always emitted via tcg_out_dup_vec. */ @@ -3163,6 +3212,53 @@ static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, } } +static void expand_vec_mul(TCGType type, unsigned vece, TCGv_vec v0, + TCGv_vec v1, TCGv_vec v2) +{ + TCGv_vec t1 = tcg_temp_new_vec(type); + TCGv_vec t2 = tcg_temp_new_vec(type); + TCGv_vec t3, t4; + + switch (vece) { + case MO_8: + case MO_16: + vec_gen_3(INDEX_op_ppc_muleu_vec, type, vece, tcgv_vec_arg(t1), + tcgv_vec_arg(v1), tcgv_vec_arg(v2)); + vec_gen_3(INDEX_op_ppc_mulou_vec, type, vece, tcgv_vec_arg(t2), + tcgv_vec_arg(v1), tcgv_vec_arg(v2)); + vec_gen_3(INDEX_op_ppc_mrgh_vec, type, vece + 1, tcgv_vec_arg(v0), + tcgv_vec_arg(t1), tcgv_vec_arg(t2)); + vec_gen_3(INDEX_op_ppc_mrgl_vec, type, vece + 1, tcgv_vec_arg(t1), + tcgv_vec_arg(t1), tcgv_vec_arg(t2)); + vec_gen_3(INDEX_op_ppc_pkum_vec, type, vece, tcgv_vec_arg(v0), + tcgv_vec_arg(v0), tcgv_vec_arg(t1)); + break; + + case MO_32: + t3 = tcg_temp_new_vec(type); + t4 = tcg_temp_new_vec(type); + tcg_gen_dupi_vec(MO_8, t4, -16); + vec_gen_3(INDEX_op_ppc_rotl_vec, type, MO_32, tcgv_vec_arg(t1), + tcgv_vec_arg(v2), tcgv_vec_arg(t4)); + vec_gen_3(INDEX_op_ppc_mulou_vec, type, MO_16, tcgv_vec_arg(t2), + tcgv_vec_arg(v1), tcgv_vec_arg(v2)); + tcg_gen_dupi_vec(MO_8, t3, 0); + vec_gen_4(INDEX_op_ppc_msum_vec, type, MO_16, tcgv_vec_arg(t3), + tcgv_vec_arg(v1), tcgv_vec_arg(t1), tcgv_vec_arg(t3)); + vec_gen_3(INDEX_op_shlv_vec, type, MO_32, tcgv_vec_arg(t3), + tcgv_vec_arg(t3), tcgv_vec_arg(t4)); + tcg_gen_add_vec(MO_32, v0, t2, t3); + tcg_temp_free_vec(t3); + tcg_temp_free_vec(t4); + break; + + default: + g_assert_not_reached(); + } + tcg_temp_free_vec(t1); + tcg_temp_free_vec(t2); +} + void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, TCGArg a0, ...) { @@ -3189,6 +3285,10 @@ void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, v2 = temp_tcgv_vec(arg_temp(a2)); expand_vec_cmp(type, vece, v0, v1, v2, va_arg(va, TCGArg)); break; + case INDEX_op_mul_vec: + v2 = temp_tcgv_vec(arg_temp(a2)); + expand_vec_mul(type, vece, v0, v1, v2); + break; default: g_assert_not_reached(); } @@ -3235,6 +3335,8 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) static const TCGTargetOpDef v_r = { .args_ct_str = { "v", "r" } }; static const TCGTargetOpDef v_v = { .args_ct_str = { "v", "v" } }; static const TCGTargetOpDef v_v_v = { .args_ct_str = { "v", "v", "v" } }; + static const TCGTargetOpDef v_v_v_v + = { .args_ct_str = { "v", "v", "v", "v" } }; switch (op) { case INDEX_op_goto_ptr: @@ -3390,6 +3492,12 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_shlv_vec: case INDEX_op_shrv_vec: case INDEX_op_sarv_vec: + case INDEX_op_ppc_mrgh_vec: + case INDEX_op_ppc_mrgl_vec: + case INDEX_op_ppc_muleu_vec: + case INDEX_op_ppc_mulou_vec: + case INDEX_op_ppc_pkum_vec: + case INDEX_op_ppc_rotl_vec: return &v_v_v; case INDEX_op_not_vec: case INDEX_op_dup_vec: @@ -3398,6 +3506,8 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_st_vec: case INDEX_op_dupm_vec: return &v_r; + case INDEX_op_ppc_msum_vec: + return &v_v_v_v; default: return NULL; From patchwork Sun May 19 04:15:19 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 10949267 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 31D7F912 for ; Sun, 19 May 2019 04:33:58 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0DDB62856F for ; Sun, 19 May 2019 04:33:58 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 007A42858E; Sun, 19 May 2019 04:33:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.0 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 9CE432856F for ; Sun, 19 May 2019 04:33:57 +0000 (UTC) Received: from localhost ([127.0.0.1]:43706 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDWG-0000Oe-Vy for patchwork-qemu-devel@patchwork.kernel.org; Sun, 19 May 2019 00:33:57 -0400 Received: from eggs.gnu.org ([209.51.188.92]:44461) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDSn-0005xg-GM for qemu-devel@nongnu.org; Sun, 19 May 2019 00:30:22 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hSDES-0004n8-OX for qemu-devel@nongnu.org; Sun, 19 May 2019 00:15:34 -0400 Received: from mail-pg1-x541.google.com ([2607:f8b0:4864:20::541]:42800) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hSDES-0004ln-Hd for qemu-devel@nongnu.org; Sun, 19 May 2019 00:15:32 -0400 Received: by mail-pg1-x541.google.com with SMTP id 145so5144978pgg.9 for ; Sat, 18 May 2019 21:15:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=7uqSx6k9s5mizV7n0y2ah5U9w3ZcH+wua84njE/R9g4=; b=ufF2jnT58F6RlZWanTcePlFu2XhcSsPEFlIEA3cP40HJ5n6SIjBbEpqV/HZbQzCBFm kcqBjtf6Uv9oAPmrstfx21BqJgYPqUfeGhr5OkEVX/xZS0DwXThA0w7mVWpLnpoxUo0Z N5BFYQGQWTMGh8KTISX2GmP7i/h3kp/0apfMNGuyZIUlUnqALrrJSTWSa0yctGbsofIn h564Y1rJNrKakwjHxLUKipZ4xs655/p6md8GqwANCmvfwkVCwqIR8AuR+t2gfm7TPmOl S8xbXm/WAVbLpPg9EgFmWl/OLds52Qr3LC2HRQVdjaC85KAZLZHjM6knTJUKGXzNZYPF 8WaA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=7uqSx6k9s5mizV7n0y2ah5U9w3ZcH+wua84njE/R9g4=; b=P6ZhMRcZ0aLZiJIPgNeRW0pdlas23IHGCS/8MyCmUefsOydeem0qt1qtKU96Ko2kPb 9G0IwyM4UWnb1/MzlxV5rEyxfl1g4tmsRVhxvLtn+Wl5KXVctLHw1MJ+b8E9mkPOmLYd rzUjDgq/0KKiHmZTlWZwTLQF24OJ3YStGf2PipcjnPHvGmtR026MUolbBZqJLzgb5SL/ xCjzAuxhHMjmGTTR5FVIRM5HT+SH845MmisSHvehxq+vI88Acg9Y+rjLkqb8t0M5zDXd QS4/n5PHPerAiMOdQ7PSCceWZOpBASs0OIItrx57/6JNOuCc9Gbr8Uvu8F9fySIjEOSq 8+dQ== X-Gm-Message-State: APjAAAUgqIV5iqDd6xA2shaHrd3GrJ7of7Pr1NeH3XjiTfofth+7jDfc xO/AP2+qj3bzDbr4x+rG6oXOUg5KwrA= X-Google-Smtp-Source: APXvYqxiEqAvI41pbW9iNHHM4AHNsGuqaHhC3bribNxbQvNCopQeygKa0NM+igLFQVTqUb8aLoQtGQ== X-Received: by 2002:a62:36c1:: with SMTP id d184mr50955466pfa.49.1558239329763; Sat, 18 May 2019 21:15:29 -0700 (PDT) Received: from localhost.localdomain (97-113-13-231.tukw.qwest.net. [97.113.13.231]) by smtp.gmail.com with ESMTPSA id k63sm19381200pfb.108.2019.05.18.21.15.28 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 18 May 2019 21:15:29 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 18 May 2019 21:15:19 -0700 Message-Id: <20190519041522.12327-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190519041522.12327-1-richard.henderson@linaro.org> References: <20190519041522.12327-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::541 Subject: [Qemu-devel] [PATCH v4 4/7] tcg/ppc: Support vector dup2 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mark.cave-ayland@ilande.co.uk Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP This is only used for 32-bit hosts. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.inc.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c index 9d58db9eb1..3219df2e90 100644 --- a/tcg/ppc/tcg-target.inc.c +++ b/tcg/ppc/tcg-target.inc.c @@ -3120,6 +3120,14 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, } break; + case INDEX_op_dup2_vec: + assert(TCG_TARGET_REG_BITS == 32); + /* With inputs a1 = xLxx, a2 = xHxx */ + tcg_out32(s, VMRGHW | VRT(a0) | VRA(a2) | VRB(a1)); /* a0 = xxHL */ + tcg_out_vsldoi(s, TCG_VEC_TMP1, a0, a0, 8); /* tmp = HLxx */ + tcg_out_vsldoi(s, a0, a0, TCG_VEC_TMP1, 8); /* a0 = HLHL */ + return; + case INDEX_op_ppc_mrgh_vec: insn = mrgh_op[vece]; break; @@ -3498,6 +3506,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_ppc_mulou_vec: case INDEX_op_ppc_pkum_vec: case INDEX_op_ppc_rotl_vec: + case INDEX_op_dup2_vec: return &v_v_v; case INDEX_op_not_vec: case INDEX_op_dup_vec: From patchwork Sun May 19 04:15:20 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 10949261 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B5E96912 for ; Sun, 19 May 2019 04:32:05 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A43732857E for ; Sun, 19 May 2019 04:32:05 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 96FFE285AF; Sun, 19 May 2019 04:32:05 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.0 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 478E22857E for ; Sun, 19 May 2019 04:32:04 +0000 (UTC) Received: from localhost ([127.0.0.1]:43686 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDUR-0007Pm-Ik for patchwork-qemu-devel@patchwork.kernel.org; Sun, 19 May 2019 00:32:03 -0400 Received: from eggs.gnu.org ([209.51.188.92]:44218) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDSn-0005d1-GE for qemu-devel@nongnu.org; Sun, 19 May 2019 00:30:22 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hSDES-0004nE-Pm for qemu-devel@nongnu.org; Sun, 19 May 2019 00:15:34 -0400 Received: from mail-pl1-x643.google.com ([2607:f8b0:4864:20::643]:37190) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hSDES-0004mV-JB for qemu-devel@nongnu.org; Sun, 19 May 2019 00:15:32 -0400 Received: by mail-pl1-x643.google.com with SMTP id p15so5159286pll.4 for ; Sat, 18 May 2019 21:15:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=6/HEiycpcQzccLPdSh6Q1zLPpis03yT1JpL1zJXm4b0=; b=fdJ3zGHJnlYXhGiSCivxYWc9TGJ7t9NbClEplviwFo1xVeA4u1izeHTczpY1HH6z0S k6GlzuWbUbdGbMSBR6qJ75RNfS3cjKKUUtnGqrmE+XJb5lPx8snw48UJdMGfB6wKyG7y w82rw6KClCZM79ggifzPCD+Efq25wB7DrYoFHGVZD6/H8Vuj/GDKT1BDHOGSL5uy4tON pFnRcjzq0WUDMe5KnO9nBbr/Ih+eWtspjqj0vIS1BOs5LVZaWmUWiTKU8/SNCI74/DRW uPt0OYIQbUo5tuE/aV2DdoigjYA3Wh4rfzYv4/Vzd1hqaKtYm4+U3e8qT7PVX7sgAKG6 483g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=6/HEiycpcQzccLPdSh6Q1zLPpis03yT1JpL1zJXm4b0=; b=c/x8+1jjgUL1o1yzqKZPhy80s7uwUcyTpbRHolcLMAWyHUx1eHjf6+qKhHgf3igMdf 7/+LuHNsH1Q6AMZ3rZjZ70P1Hmw2T1WkgcJ+u/VwttX5JVan3+IQ/oK4MJEjigjTyAiC D6PQ/q1RoLd+I4Wx9pC3v6NLMI9EaznJmpFIuHTD6u/MTFu+cF1IKBGHN2slCy0Cd5Dt lFpDucOoWzN+dj7pyENl8d3balQDk1h1UsUkDLVS+Ov4LB0rGqWDJ+ee0HufgTb5zG5s VCSHtdL5pqF4bD3jQdl51aWeU67LE6xAkelBp/Ixz0mQqv6klS4EbxHw177X4ihMklxk EsEw== X-Gm-Message-State: APjAAAWbSHj6PWuZvuJHc8OCbDwb0NQAaABPlYljFghaWPkzBV64XT9v oiDeWCsz/V5td1NSwlcRjTHr9OrN9wE= X-Google-Smtp-Source: APXvYqxKPIQlzM6cA0cHX1S82Hwrzd9Gfd1TiH34r1sxEXeZWWhcmHSNaT/MUVDmYkNjOYWjVY1K/g== X-Received: by 2002:a17:902:2947:: with SMTP id g65mr48803059plb.115.1558239330846; Sat, 18 May 2019 21:15:30 -0700 (PDT) Received: from localhost.localdomain (97-113-13-231.tukw.qwest.net. [97.113.13.231]) by smtp.gmail.com with ESMTPSA id k63sm19381200pfb.108.2019.05.18.21.15.29 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 18 May 2019 21:15:30 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 18 May 2019 21:15:20 -0700 Message-Id: <20190519041522.12327-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190519041522.12327-1-richard.henderson@linaro.org> References: <20190519041522.12327-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::643 Subject: [Qemu-devel] [PATCH v4 5/7] tcg/ppc: Update vector support to v2.06 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mark.cave-ayland@ilande.co.uk Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP This includes double-word loads and stores, double-word load and splat, double-word permute, and bit select. All of which require multiple operations in the base Altivec instruction set. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.h | 5 ++-- tcg/ppc/tcg-target.inc.c | 51 ++++++++++++++++++++++++++++++++++++---- 2 files changed, 50 insertions(+), 6 deletions(-) diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h index a130192cbd..40544f996d 100644 --- a/tcg/ppc/tcg-target.h +++ b/tcg/ppc/tcg-target.h @@ -60,6 +60,7 @@ typedef enum { extern bool have_isa_altivec; extern bool have_isa_2_06; +extern bool have_isa_2_06_vsx; extern bool have_isa_3_00; /* optional instructions automatically implemented */ @@ -141,7 +142,7 @@ extern bool have_isa_3_00; * instruction and substituting two 32-bit stores makes the generated * code quite large. */ -#define TCG_TARGET_HAS_v64 0 +#define TCG_TARGET_HAS_v64 have_isa_2_06_vsx #define TCG_TARGET_HAS_v128 have_isa_altivec #define TCG_TARGET_HAS_v256 0 @@ -157,7 +158,7 @@ extern bool have_isa_3_00; #define TCG_TARGET_HAS_mul_vec 1 #define TCG_TARGET_HAS_sat_vec 1 #define TCG_TARGET_HAS_minmax_vec 1 -#define TCG_TARGET_HAS_bitsel_vec 0 +#define TCG_TARGET_HAS_bitsel_vec have_isa_2_06_vsx #define TCG_TARGET_HAS_cmpsel_vec 0 void flush_icache_range(uintptr_t start, uintptr_t stop); diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c index 3219df2e90..6cb8c8f0eb 100644 --- a/tcg/ppc/tcg-target.inc.c +++ b/tcg/ppc/tcg-target.inc.c @@ -66,6 +66,7 @@ static tcg_insn_unit *tb_ret_addr; bool have_isa_altivec; bool have_isa_2_06; +bool have_isa_2_06_vsx; bool have_isa_3_00; #define HAVE_ISA_2_06 have_isa_2_06 @@ -470,9 +471,12 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define LVEBX XO31(7) #define LVEHX XO31(39) #define LVEWX XO31(71) +#define LXSDX XO31(588) /* v2.06 */ +#define LXVDSX XO31(332) /* v2.06 */ #define STVX XO31(231) #define STVEWX XO31(199) +#define STXSDX XO31(716) /* v2.06 */ #define VADDSBS VX4(768) #define VADDUBS VX4(512) @@ -561,6 +565,9 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define VSLDOI VX4(44) +#define XXPERMDI (OPCD(60) | (10 << 3)) /* v2.06 */ +#define XXSEL (OPCD(60) | (3 << 4)) /* v2.06 */ + #define RT(r) ((r)<<21) #define RS(r) ((r)<<21) #define RA(r) ((r)<<16) @@ -887,11 +894,21 @@ static void tcg_out_dupi_vec(TCGContext *s, TCGType type, TCGReg ret, add = 0; } - load_insn = LVX | VRT(ret) | RB(TCG_REG_TMP1); - if (TCG_TARGET_REG_BITS == 64) { - new_pool_l2(s, rel, s->code_ptr, add, val, val); + if (have_isa_2_06_vsx) { + load_insn = type == TCG_TYPE_V64 ? LXSDX : LXVDSX; + load_insn |= VRT(ret) | RB(TCG_REG_TMP1) | 1; + if (TCG_TARGET_REG_BITS == 64) { + new_pool_label(s, val, rel, s->code_ptr, add); + } else { + new_pool_l2(s, rel, s->code_ptr, add, val, val); + } } else { - new_pool_l4(s, rel, s->code_ptr, add, val, val, val, val); + load_insn = LVX | VRT(ret) | RB(TCG_REG_TMP1); + if (TCG_TARGET_REG_BITS == 64) { + new_pool_l2(s, rel, s->code_ptr, add, val, val); + } else { + new_pool_l4(s, rel, s->code_ptr, add, val, val, val, val); + } } if (USE_REG_TB) { @@ -1138,6 +1155,10 @@ static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret, /* fallthru */ case TCG_TYPE_V64: tcg_debug_assert(ret >= 32); + if (have_isa_2_06_vsx) { + tcg_out_mem_long(s, 0, LXSDX | 1, ret & 31, base, offset); + break; + } assert((offset & 7) == 0); tcg_out_mem_long(s, 0, LVX, ret & 31, base, offset & -16); if (offset & 8) { @@ -1181,6 +1202,10 @@ static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg, /* fallthru */ case TCG_TYPE_V64: tcg_debug_assert(arg >= 32); + if (have_isa_2_06_vsx) { + tcg_out_mem_long(s, 0, STXSDX | 1, arg & 31, base, offset); + break; + } assert((offset & 7) == 0); if (offset & 8) { tcg_out_vsldoi(s, TCG_VEC_TMP1, arg, arg, 8); @@ -2916,6 +2941,8 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_shri_vec: case INDEX_op_sari_vec: return vece <= MO_32 ? -1 : 0; + case INDEX_op_bitsel_vec: + return have_isa_2_06_vsx; default: return 0; } @@ -2942,6 +2969,10 @@ static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, tcg_out32(s, VSPLTW | VRT(dst) | VRB(src) | (1 << 16)); break; case MO_64: + if (have_isa_2_06_vsx) { + tcg_out32(s, XXPERMDI | 7 | VRT(dst) | VRA(src) | VRB(src)); + break; + } tcg_out_vsldoi(s, TCG_VEC_TMP1, src, src, 8); tcg_out_vsldoi(s, dst, TCG_VEC_TMP1, src, 8); break; @@ -2986,6 +3017,10 @@ static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, tcg_out32(s, VSPLTW | VRT(out) | VRB(out) | (elt << 16)); break; case MO_64: + if (have_isa_2_06_vsx) { + tcg_out_mem_long(s, 0, LXVDSX | 1, out, base, offset); + break; + } assert((offset & 7) == 0); tcg_out_mem_long(s, 0, LVX, out, base, offset & -16); tcg_out_vsldoi(s, TCG_VEC_TMP1, out, out, 8); @@ -3120,6 +3155,10 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, } break; + case INDEX_op_bitsel_vec: + tcg_out32(s, XXSEL | 0xf | VRT(a0) | VRC(a1) | VRB(a2) | VRA(args[3])); + return; + case INDEX_op_dup2_vec: assert(TCG_TARGET_REG_BITS == 32); /* With inputs a1 = xLxx, a2 = xHxx */ @@ -3515,6 +3554,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_st_vec: case INDEX_op_dupm_vec: return &v_r; + case INDEX_op_bitsel_vec: case INDEX_op_ppc_msum_vec: return &v_v_v_v; @@ -3533,6 +3573,9 @@ static void tcg_target_init(TCGContext *s) } if (hwcap & PPC_FEATURE_ARCH_2_06) { have_isa_2_06 = true; + if (hwcap & PPC_FEATURE_HAS_VSX) { + have_isa_2_06_vsx = true; + } } #ifdef PPC_FEATURE2_ARCH_3_00 if (hwcap2 & PPC_FEATURE2_ARCH_3_00) { From patchwork Sun May 19 04:15:21 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 10949265 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BB98A912 for ; Sun, 19 May 2019 04:32:06 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A93622856F for ; Sun, 19 May 2019 04:32:06 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9DA2D2858E; Sun, 19 May 2019 04:32:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.0 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id A34412856F for ; Sun, 19 May 2019 04:32:03 +0000 (UTC) Received: from localhost ([127.0.0.1]:43684 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDUQ-0007PV-VB for patchwork-qemu-devel@patchwork.kernel.org; Sun, 19 May 2019 00:32:03 -0400 Received: from eggs.gnu.org ([209.51.188.92]:44461) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDSl-0005xg-Lh for qemu-devel@nongnu.org; Sun, 19 May 2019 00:30:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hSDEV-0004oC-6L for qemu-devel@nongnu.org; Sun, 19 May 2019 00:15:36 -0400 Received: from mail-pg1-x543.google.com ([2607:f8b0:4864:20::543]:34026) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hSDEU-0004nI-UN for qemu-devel@nongnu.org; Sun, 19 May 2019 00:15:35 -0400 Received: by mail-pg1-x543.google.com with SMTP id c13so5162668pgt.1 for ; Sat, 18 May 2019 21:15:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=i1NgkQuZk10k1ceiKaxM9hxXyQyjYikTQa3PtpV7vqI=; b=RaKTvM87jfK7kpIweYbfN59cIiK5mmwsgMEK3bQKV/S7j943ZUbXrWSK4+vwkoxSCr 3rsS4hWRif0iFD78PnWJPTnEQoGSlleCCTsIdBAKpYKbW8E7wNzPOs7nhT4d9xV1VlaF cp7qY0VCRB1+46xpHERqH8SBfBBH9vL35TTyVA5wGQbEGcYauWXp94g2kvxf8kvE/nEC ihlOnmA6OiOotGTRXqDrvuf/xHXw9pT4VeX6HD1K4SqAU+CrEFJ4EPJIewdnl/v3Mqdc Hi66oFqLvrSHSIK/fRlRy++oWiqdsCf1wGJTqnFyZC1Hd03aVgH0vjuUf6KLHBwR0nEY DVDQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=i1NgkQuZk10k1ceiKaxM9hxXyQyjYikTQa3PtpV7vqI=; b=r4FNNTbK3PDdNnRojKQvP2TI9V5TJpfDG+yUfcJr0PC3yoRLG348ZMrMaI0bGT+Uge wnrjtU1D5uiTDNW+XhSkK+VF8gdPmsbthag1ZWhvFqNfgl1IQ0VchFLMU3hhAfslOvnn 1L/GalgRkgUDr/gVa+p11ukZPFlbobzdyYH9oCwHHWAER0DvTOKVajQL2hLAAQmUNvaD lKXXZfKuTp6NNcrn/iT+LpmdJ75hzO5aNMOETU9WStmkJqw4Wu67YgQmEMOblrLfd73b M/V3Egvwpf/DLGnMLsOD1a2xg3cXLHmdDmiqiaA2faOHTbyJ97KobHtaEEl70soTb1MU yBmA== X-Gm-Message-State: APjAAAWl6KuamtfeVgem+D8yQ1866ARIlo7QignFda/1MiUt90aOskzr JzWXk3w8etUXbYp7KTY3AmPZzmtXxtY= X-Google-Smtp-Source: APXvYqxCHK2O+PfzuhCArKSiZIOCNOZrGIww3N7XxHV9V7XNHbJxJzSWJcxd1cyODU2UnMzb4D3dLg== X-Received: by 2002:a63:6988:: with SMTP id e130mr68362551pgc.150.1558239332094; Sat, 18 May 2019 21:15:32 -0700 (PDT) Received: from localhost.localdomain (97-113-13-231.tukw.qwest.net. [97.113.13.231]) by smtp.gmail.com with ESMTPSA id k63sm19381200pfb.108.2019.05.18.21.15.31 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 18 May 2019 21:15:31 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 18 May 2019 21:15:21 -0700 Message-Id: <20190519041522.12327-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190519041522.12327-1-richard.henderson@linaro.org> References: <20190519041522.12327-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::543 Subject: [Qemu-devel] [PATCH v4 6/7] tcg/ppc: Update vector support to v2.07 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mark.cave-ayland@ilande.co.uk Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP This includes single-word loads and stores, lots of double-word arithmetic, and a few extra logical operations. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.h | 3 +- tcg/ppc/tcg-target.inc.c | 111 +++++++++++++++++++++++++++++++-------- 2 files changed, 91 insertions(+), 23 deletions(-) diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h index 40544f996d..b8355d0a56 100644 --- a/tcg/ppc/tcg-target.h +++ b/tcg/ppc/tcg-target.h @@ -61,6 +61,7 @@ typedef enum { extern bool have_isa_altivec; extern bool have_isa_2_06; extern bool have_isa_2_06_vsx; +extern bool have_isa_2_07_vsx; extern bool have_isa_3_00; /* optional instructions automatically implemented */ @@ -147,7 +148,7 @@ extern bool have_isa_3_00; #define TCG_TARGET_HAS_v256 0 #define TCG_TARGET_HAS_andc_vec 1 -#define TCG_TARGET_HAS_orc_vec 0 +#define TCG_TARGET_HAS_orc_vec have_isa_2_07_vsx #define TCG_TARGET_HAS_not_vec 1 #define TCG_TARGET_HAS_neg_vec 0 #define TCG_TARGET_HAS_abs_vec 0 diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c index 6cb8c8f0eb..dedf0de04d 100644 --- a/tcg/ppc/tcg-target.inc.c +++ b/tcg/ppc/tcg-target.inc.c @@ -67,6 +67,7 @@ static tcg_insn_unit *tb_ret_addr; bool have_isa_altivec; bool have_isa_2_06; bool have_isa_2_06_vsx; +bool have_isa_2_07_vsx; bool have_isa_3_00; #define HAVE_ISA_2_06 have_isa_2_06 @@ -473,10 +474,12 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define LVEWX XO31(71) #define LXSDX XO31(588) /* v2.06 */ #define LXVDSX XO31(332) /* v2.06 */ +#define LXSIWZX XO31(12) /* v2.07 */ #define STVX XO31(231) #define STVEWX XO31(199) #define STXSDX XO31(716) /* v2.06 */ +#define STXSIWX XO31(140) /* v2.07 */ #define VADDSBS VX4(768) #define VADDUBS VX4(512) @@ -487,6 +490,7 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define VADDSWS VX4(896) #define VADDUWS VX4(640) #define VADDUWM VX4(128) +#define VADDUDM VX4(192) /* v2.07 */ #define VSUBSBS VX4(1792) #define VSUBUBS VX4(1536) @@ -497,47 +501,62 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define VSUBSWS VX4(1920) #define VSUBUWS VX4(1664) #define VSUBUWM VX4(1152) +#define VSUBUDM VX4(1216) /* v2.07 */ #define VMAXSB VX4(258) #define VMAXSH VX4(322) #define VMAXSW VX4(386) +#define VMAXSD VX4(450) /* v2.07 */ #define VMAXUB VX4(2) #define VMAXUH VX4(66) #define VMAXUW VX4(130) +#define VMAXUD VX4(194) /* v2.07 */ #define VMINSB VX4(770) #define VMINSH VX4(834) #define VMINSW VX4(898) +#define VMINSD VX4(962) /* v2.07 */ #define VMINUB VX4(514) #define VMINUH VX4(578) #define VMINUW VX4(642) +#define VMINUD VX4(706) /* v2.07 */ #define VCMPEQUB VX4(6) #define VCMPEQUH VX4(70) #define VCMPEQUW VX4(134) +#define VCMPEQUD VX4(199) /* v2.07 */ #define VCMPGTSB VX4(774) #define VCMPGTSH VX4(838) #define VCMPGTSW VX4(902) +#define VCMPGTSD VX4(967) /* v2.07 */ #define VCMPGTUB VX4(518) #define VCMPGTUH VX4(582) #define VCMPGTUW VX4(646) +#define VCMPGTUD VX4(711) /* v2.07 */ #define VSLB VX4(260) #define VSLH VX4(324) #define VSLW VX4(388) +#define VSLD VX4(1476) /* v2.07 */ #define VSRB VX4(516) #define VSRH VX4(580) #define VSRW VX4(644) +#define VSRD VX4(1732) /* v2.07 */ #define VSRAB VX4(772) #define VSRAH VX4(836) #define VSRAW VX4(900) +#define VSRAD VX4(964) /* v2.07 */ #define VRLB VX4(4) #define VRLH VX4(68) #define VRLW VX4(132) +#define VRLD VX4(196) /* v2.07 */ #define VMULEUB VX4(520) #define VMULEUH VX4(584) +#define VMULEUW VX4(648) /* v2.07 */ #define VMULOUB VX4(8) #define VMULOUH VX4(72) +#define VMULOUW VX4(136) /* v2.07 */ +#define VMULUWM VX4(137) /* v2.07 */ #define VMSUMUHM VX4(38) #define VMRGHB VX4(12) @@ -555,6 +574,9 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define VNOR VX4(1284) #define VOR VX4(1156) #define VXOR VX4(1220) +#define VEQV VX4(1668) /* v2.07 */ +#define VNAND VX4(1412) /* v2.07 */ +#define VORC VX4(1348) /* v2.07 */ #define VSPLTB VX4(524) #define VSPLTH VX4(588) @@ -568,6 +590,11 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define XXPERMDI (OPCD(60) | (10 << 3)) /* v2.06 */ #define XXSEL (OPCD(60) | (3 << 4)) /* v2.06 */ +#define MFVSRD XO31(51) /* v2.07 */ +#define MFVSRWZ XO31(115) /* v2.07 */ +#define MTVSRD XO31(179) /* v2.07 */ +#define MTVSRWZ XO31(179) /* v2.07 */ + #define RT(r) ((r)<<21) #define RS(r) ((r)<<21) #define RA(r) ((r)<<16) @@ -700,7 +727,15 @@ static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) if (ret < 32 && arg < 32) { tcg_out32(s, OR | SAB(arg, ret, arg)); break; - } else if (ret < 32 || arg < 32) { + } else if (ret < 32 && have_isa_2_07_vsx) { + tcg_out32(s, (type == TCG_TYPE_I32 ? MFVSRWZ : MFVSRD) + | VRT(arg) | RA(ret) | 1); + break; + } else if (arg < 32 && have_isa_2_07_vsx) { + tcg_out32(s, (type == TCG_TYPE_I32 ? MTVSRWZ : MTVSRD) + | VRT(ret) | RA(arg) | 1); + break; + } else { /* Altivec does not support vector/integer moves. */ return false; } @@ -1140,6 +1175,10 @@ static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret, tcg_out_mem_long(s, LWZ, LWZX, ret, base, offset); break; } + if (have_isa_2_07_vsx) { + tcg_out_mem_long(s, 0, LXSIWZX | 1, ret & 31, base, offset); + break; + } assert((offset & 3) == 0); tcg_out_mem_long(s, 0, LVEWX, ret & 31, base, offset); shift = (offset - 4) & 0xc; @@ -1186,6 +1225,10 @@ static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg, tcg_out_mem_long(s, STW, STWX, arg, base, offset); break; } + if (have_isa_2_07_vsx) { + tcg_out_mem_long(s, 0, STXSIWX | 1, arg & 31, base, offset); + break; + } assert((offset & 3) == 0); shift = (offset - 4) & 0xc; if (shift) { @@ -2921,26 +2964,37 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_andc_vec: case INDEX_op_not_vec: return 1; + case INDEX_op_orc_vec: + return have_isa_2_07_vsx; case INDEX_op_add_vec: case INDEX_op_sub_vec: case INDEX_op_smax_vec: case INDEX_op_smin_vec: case INDEX_op_umax_vec: case INDEX_op_umin_vec: + case INDEX_op_shlv_vec: + case INDEX_op_shrv_vec: + case INDEX_op_sarv_vec: + return vece <= MO_32 || have_isa_2_07_vsx; case INDEX_op_ssadd_vec: case INDEX_op_sssub_vec: case INDEX_op_usadd_vec: case INDEX_op_ussub_vec: - case INDEX_op_shlv_vec: - case INDEX_op_shrv_vec: - case INDEX_op_sarv_vec: return vece <= MO_32; case INDEX_op_cmp_vec: - case INDEX_op_mul_vec: case INDEX_op_shli_vec: case INDEX_op_shri_vec: case INDEX_op_sari_vec: - return vece <= MO_32 ? -1 : 0; + return vece <= MO_32 || have_isa_2_07_vsx ? -1 : 0; + case INDEX_op_mul_vec: + switch (vece) { + case MO_8: + case MO_16: + return -1; + case MO_32: + return have_isa_2_07_vsx ? 1 : -1; + } + return 0; case INDEX_op_bitsel_vec: return have_isa_2_06_vsx; default: @@ -3045,28 +3099,28 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, const int *const_args) { static const uint32_t - add_op[4] = { VADDUBM, VADDUHM, VADDUWM, 0 }, - sub_op[4] = { VSUBUBM, VSUBUHM, VSUBUWM, 0 }, - eq_op[4] = { VCMPEQUB, VCMPEQUH, VCMPEQUW, 0 }, - gts_op[4] = { VCMPGTSB, VCMPGTSH, VCMPGTSW, 0 }, - gtu_op[4] = { VCMPGTUB, VCMPGTUH, VCMPGTUW, 0 }, + add_op[4] = { VADDUBM, VADDUHM, VADDUWM, VADDUDM }, + sub_op[4] = { VSUBUBM, VSUBUHM, VSUBUWM, VSUBUDM }, + eq_op[4] = { VCMPEQUB, VCMPEQUH, VCMPEQUW, VCMPEQUD }, + gts_op[4] = { VCMPGTSB, VCMPGTSH, VCMPGTSW, VCMPGTSD }, + gtu_op[4] = { VCMPGTUB, VCMPGTUH, VCMPGTUW, VCMPGTUD }, ssadd_op[4] = { VADDSBS, VADDSHS, VADDSWS, 0 }, usadd_op[4] = { VADDUBS, VADDUHS, VADDUWS, 0 }, sssub_op[4] = { VSUBSBS, VSUBSHS, VSUBSWS, 0 }, ussub_op[4] = { VSUBUBS, VSUBUHS, VSUBUWS, 0 }, - umin_op[4] = { VMINUB, VMINUH, VMINUW, 0 }, - smin_op[4] = { VMINSB, VMINSH, VMINSW, 0 }, - umax_op[4] = { VMAXUB, VMAXUH, VMAXUW, 0 }, - smax_op[4] = { VMAXSB, VMAXSH, VMAXSW, 0 }, - shlv_op[4] = { VSLB, VSLH, VSLW, 0 }, - shrv_op[4] = { VSRB, VSRH, VSRW, 0 }, - sarv_op[4] = { VSRAB, VSRAH, VSRAW, 0 }, + umin_op[4] = { VMINUB, VMINUH, VMINUW, VMINUD }, + smin_op[4] = { VMINSB, VMINSH, VMINSW, VMINSD }, + umax_op[4] = { VMAXUB, VMAXUH, VMAXUW, VMAXUD }, + smax_op[4] = { VMAXSB, VMAXSH, VMAXSW, VMAXSD }, + shlv_op[4] = { VSLB, VSLH, VSLW, VSLD }, + shrv_op[4] = { VSRB, VSRH, VSRW, VSRD }, + sarv_op[4] = { VSRAB, VSRAH, VSRAW, VSRAD }, mrgh_op[4] = { VMRGHB, VMRGHH, VMRGHW, 0 }, mrgl_op[4] = { VMRGLB, VMRGLH, VMRGLW, 0 }, - muleu_op[4] = { VMULEUB, VMULEUH, 0, 0 }, - mulou_op[4] = { VMULOUB, VMULOUH, 0, 0 }, + muleu_op[4] = { VMULEUB, VMULEUH, VMULEUW, 0 }, + mulou_op[4] = { VMULOUB, VMULOUH, VMULOUW, 0 }, pkum_op[4] = { VPKUHUM, VPKUWUM, 0, 0 }, - rotl_op[4] = { VRLB, VRLH, VRLW, 0 }; + rotl_op[4] = { VRLB, VRLH, VRLW, VRLD }; TCGType type = vecl + TCG_TYPE_V64; TCGArg a0 = args[0], a1 = args[1], a2 = args[2]; @@ -3089,6 +3143,10 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_sub_vec: insn = sub_op[vece]; break; + case INDEX_op_mul_vec: + tcg_debug_assert(vece == MO_32 && have_isa_2_07_vsx); + insn = VMULUWM; + break; case INDEX_op_ssadd_vec: insn = ssadd_op[vece]; break; @@ -3138,6 +3196,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, insn = VNOR; a2 = a1; break; + case INDEX_op_orc_vec: + insn = VORC; + break; case INDEX_op_cmp_vec: switch (args[3]) { @@ -3218,7 +3279,7 @@ static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, { bool need_swap = false, need_inv = false; - tcg_debug_assert(vece <= MO_32); + tcg_debug_assert(vece <= MO_32 || have_isa_2_07_vsx); switch (cond) { case TCG_COND_EQ: @@ -3282,6 +3343,7 @@ static void expand_vec_mul(TCGType type, unsigned vece, TCGv_vec v0, break; case MO_32: + tcg_debug_assert(!have_isa_2_07_vsx); t3 = tcg_temp_new_vec(type); t4 = tcg_temp_new_vec(type); tcg_gen_dupi_vec(MO_8, t4, -16); @@ -3577,6 +3639,11 @@ static void tcg_target_init(TCGContext *s) have_isa_2_06_vsx = true; } } + if (hwcap2 & PPC_FEATURE2_ARCH_2_07) { + if (hwcap & PPC_FEATURE_HAS_VSX) { + have_isa_2_07_vsx = true; + } + } #ifdef PPC_FEATURE2_ARCH_3_00 if (hwcap2 & PPC_FEATURE2_ARCH_3_00) { have_isa_3_00 = true; From patchwork Sun May 19 04:15:22 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 10949263 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7132613AD for ; Sun, 19 May 2019 04:32:06 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6035A2857E for ; Sun, 19 May 2019 04:32:06 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 51AAD285AB; Sun, 19 May 2019 04:32:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.0 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id D1B7B2858E for ; Sun, 19 May 2019 04:32:04 +0000 (UTC) Received: from localhost ([127.0.0.1]:43688 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDUS-0007QL-5Q for patchwork-qemu-devel@patchwork.kernel.org; Sun, 19 May 2019 00:32:04 -0400 Received: from eggs.gnu.org ([209.51.188.92]:44218) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDSl-0005d1-Ki for qemu-devel@nongnu.org; Sun, 19 May 2019 00:30:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hSDEV-0004oI-7w for qemu-devel@nongnu.org; Sun, 19 May 2019 00:15:37 -0400 Received: from mail-pl1-x644.google.com ([2607:f8b0:4864:20::644]:44802) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hSDEU-0004nk-Vj for qemu-devel@nongnu.org; Sun, 19 May 2019 00:15:35 -0400 Received: by mail-pl1-x644.google.com with SMTP id c5so5127253pll.11 for ; Sat, 18 May 2019 21:15:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=oKgfkKq1+9tZLjqz20VcCB7egfqrJKKXg2fY4UFHrI4=; b=j2/sjQdBHABD4UhA4Rt83wxenWjCDpCr3kp5JLoxI0wKq5DBew1u7hU1T1M/uFGknB gBpgzzcqyiTs0PxVRquvA3I6gRiu7a1QOM9tG1crnBtDXGXO+Dw8oqIahq4XBXlb6enQ GEu5hUgjj7TZGNWEQtErvCsVZ7KQjeAXwS1mQfy1vFr6jAa0J0h6W4c4o26v9agILssN NpyPMOmnNvfHyQahAaFaAZRr78hbHHvPH5coR8tkro6+ShyIO2oGuXjfJ3yxA/0mR3iJ SzR86GsNbcMNNwfXNKJwvIRJTWUkPOtDI4RJg8QWX9MEqJ1JrG3yZANGuxG0WVtahdeW nK1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=oKgfkKq1+9tZLjqz20VcCB7egfqrJKKXg2fY4UFHrI4=; b=KYCP9jaACAC42vC4ceMnVpxjKz178YYHGbiJCmlWcWRZ+o9hIBo7/FnsgEIa8x/O23 OtwM+OAe8JLhIeKOZdeaCjJWC/wBAYaQ633x263lKpgqn6qjpTASBimxlCpZl3aLsX14 vZfYiXPKFcLrnUEc0Pe+7a6PDJaKU55a2JKG7XcSp46pJu4Edf9bUID9pwK6Z83LEIue gHwS1A/NAJmtTdJyNXc9uMm1CtwpT240HNCKAM1NP9wzPCc4+msk1P5Oau12hwdbxb9t NsyNgW12A+UeLEyOBfB9fv3TOwuS60OnxSm4kgNJShQv0vSREOXDPwjCFm5Z0rVr7Ld3 D0wQ== X-Gm-Message-State: APjAAAWFc+nUodrxkZAPxKxbe2CBE1hCzclB2KoqEwK4zqHBU8hpgVNt pGsS3IiWjKM4fN4JF3aClOj0cUd5YdU= X-Google-Smtp-Source: APXvYqwmPvaUfTzwZS7fsSR1I+N3fr7LAxdDMLx8ggWBmR07vYERZ5h5osT/xN+HoYs1bgbIf6NK1Q== X-Received: by 2002:a17:902:294a:: with SMTP id g68mr42807780plb.169.1558239333373; Sat, 18 May 2019 21:15:33 -0700 (PDT) Received: from localhost.localdomain (97-113-13-231.tukw.qwest.net. [97.113.13.231]) by smtp.gmail.com with ESMTPSA id k63sm19381200pfb.108.2019.05.18.21.15.32 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 18 May 2019 21:15:32 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 18 May 2019 21:15:22 -0700 Message-Id: <20190519041522.12327-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190519041522.12327-1-richard.henderson@linaro.org> References: <20190519041522.12327-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::644 Subject: [Qemu-devel] [PATCH v4 7/7] tcg/ppc: Update vector support to v3.00 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mark.cave-ayland@ilande.co.uk Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP This includes vector load/store with immediate offset, some extra move and splat insns, compare ne, and negate. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.h | 3 +- tcg/ppc/tcg-target.inc.c | 103 ++++++++++++++++++++++++++++++++++----- 2 files changed, 94 insertions(+), 12 deletions(-) diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h index b8355d0a56..533f0ef510 100644 --- a/tcg/ppc/tcg-target.h +++ b/tcg/ppc/tcg-target.h @@ -63,6 +63,7 @@ extern bool have_isa_2_06; extern bool have_isa_2_06_vsx; extern bool have_isa_2_07_vsx; extern bool have_isa_3_00; +extern bool have_isa_3_00_vsx; /* optional instructions automatically implemented */ #define TCG_TARGET_HAS_ext8u_i32 0 /* andi */ @@ -150,7 +151,7 @@ extern bool have_isa_3_00; #define TCG_TARGET_HAS_andc_vec 1 #define TCG_TARGET_HAS_orc_vec have_isa_2_07_vsx #define TCG_TARGET_HAS_not_vec 1 -#define TCG_TARGET_HAS_neg_vec 0 +#define TCG_TARGET_HAS_neg_vec have_isa_3_00_vsx #define TCG_TARGET_HAS_abs_vec 0 #define TCG_TARGET_HAS_shi_vec 0 #define TCG_TARGET_HAS_shs_vec 0 diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c index dedf0de04d..4ee77df178 100644 --- a/tcg/ppc/tcg-target.inc.c +++ b/tcg/ppc/tcg-target.inc.c @@ -69,6 +69,7 @@ bool have_isa_2_06; bool have_isa_2_06_vsx; bool have_isa_2_07_vsx; bool have_isa_3_00; +bool have_isa_3_00_vsx; #define HAVE_ISA_2_06 have_isa_2_06 #define HAVE_ISEL have_isa_2_06 @@ -475,11 +476,16 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define LXSDX XO31(588) /* v2.06 */ #define LXVDSX XO31(332) /* v2.06 */ #define LXSIWZX XO31(12) /* v2.07 */ +#define LXV (OPCD(61) | 1) /* v3.00 */ +#define LXSD (OPCD(57) | 2) /* v3.00 */ +#define LXVWSX XO31(364) /* v3.00 */ #define STVX XO31(231) #define STVEWX XO31(199) #define STXSDX XO31(716) /* v2.06 */ #define STXSIWX XO31(140) /* v2.07 */ +#define STXV (OPCD(61) | 5) /* v3.00 */ +#define STXSD (OPCD(61) | 2) /* v3.00 */ #define VADDSBS VX4(768) #define VADDUBS VX4(512) @@ -503,6 +509,9 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define VSUBUWM VX4(1152) #define VSUBUDM VX4(1216) /* v2.07 */ +#define VNEGW (VX4(1538) | (6 << 16)) /* v3.00 */ +#define VNEGD (VX4(1538) | (7 << 16)) /* v3.00 */ + #define VMAXSB VX4(258) #define VMAXSH VX4(322) #define VMAXSW VX4(386) @@ -532,6 +541,9 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define VCMPGTUH VX4(582) #define VCMPGTUW VX4(646) #define VCMPGTUD VX4(711) /* v2.07 */ +#define VCMPNEB VX4(7) /* v3.00 */ +#define VCMPNEH VX4(71) /* v3.00 */ +#define VCMPNEW VX4(135) /* v3.00 */ #define VSLB VX4(260) #define VSLH VX4(324) @@ -589,11 +601,14 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define XXPERMDI (OPCD(60) | (10 << 3)) /* v2.06 */ #define XXSEL (OPCD(60) | (3 << 4)) /* v2.06 */ +#define XXSPLTIB (OPCD(60) | (360 << 1)) /* v3.00 */ #define MFVSRD XO31(51) /* v2.07 */ #define MFVSRWZ XO31(115) /* v2.07 */ #define MTVSRD XO31(179) /* v2.07 */ #define MTVSRWZ XO31(179) /* v2.07 */ +#define MTVSRDD XO31(435) /* v3.00 */ +#define MTVSRWS XO31(403) /* v3.00 */ #define RT(r) ((r)<<21) #define RS(r) ((r)<<21) @@ -917,6 +932,10 @@ static void tcg_out_dupi_vec(TCGContext *s, TCGType type, TCGReg ret, return; } } + if (have_isa_3_00_vsx && val == (tcg_target_long)dup_const(MO_8, val)) { + tcg_out32(s, XXSPLTIB | VRT(ret) | ((val & 0xff) << 11) | 1); + return; + } /* * Otherwise we must load the value from the constant pool. @@ -1105,7 +1124,7 @@ static void tcg_out_mem_long(TCGContext *s, int opi, int opx, TCGReg rt, TCGReg base, tcg_target_long offset) { tcg_target_long orig = offset, l0, l1, extra = 0, align = 0; - bool is_store = false; + bool is_int_store = false; TCGReg rs = TCG_REG_TMP1; switch (opi) { @@ -1118,11 +1137,20 @@ static void tcg_out_mem_long(TCGContext *s, int opi, int opx, TCGReg rt, break; } break; + case LXSD: + case STXSD: + align = 3; + break; + case LXV: case LXV | 8: + case STXV: case STXV | 8: + /* The |8 cases force altivec registers. */ + align = 15; + break; case STD: align = 3; /* FALLTHRU */ case STB: case STH: case STW: - is_store = true; + is_int_store = true; break; } @@ -1131,7 +1159,7 @@ static void tcg_out_mem_long(TCGContext *s, int opi, int opx, TCGReg rt, if (rs == base) { rs = TCG_REG_R0; } - tcg_debug_assert(!is_store || rs != rt); + tcg_debug_assert(!is_int_store || rs != rt); tcg_out_movi(s, TCG_TYPE_PTR, rs, orig); tcg_out32(s, opx | TAB(rt, base, rs)); return; @@ -1195,7 +1223,8 @@ static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret, case TCG_TYPE_V64: tcg_debug_assert(ret >= 32); if (have_isa_2_06_vsx) { - tcg_out_mem_long(s, 0, LXSDX | 1, ret & 31, base, offset); + tcg_out_mem_long(s, have_isa_3_00_vsx ? LXSD : 0, LXSDX | 1, + ret & 31, base, offset); break; } assert((offset & 7) == 0); @@ -1207,7 +1236,8 @@ static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret, case TCG_TYPE_V128: tcg_debug_assert(ret >= 32); assert((offset & 15) == 0); - tcg_out_mem_long(s, 0, LVX, ret & 31, base, offset); + tcg_out_mem_long(s, have_isa_3_00_vsx ? LXV | 8 : 0, LVX, + ret & 31, base, offset); break; default: g_assert_not_reached(); @@ -1246,7 +1276,8 @@ static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg, case TCG_TYPE_V64: tcg_debug_assert(arg >= 32); if (have_isa_2_06_vsx) { - tcg_out_mem_long(s, 0, STXSDX | 1, arg & 31, base, offset); + tcg_out_mem_long(s, have_isa_3_00_vsx ? STXSD : 0, + STXSDX | 1, arg & 31, base, offset); break; } assert((offset & 7) == 0); @@ -1259,7 +1290,8 @@ static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg, break; case TCG_TYPE_V128: tcg_debug_assert(arg >= 32); - tcg_out_mem_long(s, 0, STVX, arg & 31, base, offset); + tcg_out_mem_long(s, have_isa_3_00_vsx ? STXV | 8 : 0, STVX, + arg & 31, base, offset); break; default: g_assert_not_reached(); @@ -2986,6 +3018,8 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_shri_vec: case INDEX_op_sari_vec: return vece <= MO_32 || have_isa_2_07_vsx ? -1 : 0; + case INDEX_op_neg_vec: + return vece >= MO_32 && have_isa_3_00_vsx; case INDEX_op_mul_vec: switch (vece) { case MO_8: @@ -3006,7 +3040,22 @@ static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, TCGReg dst, TCGReg src) { tcg_debug_assert(dst >= 32); - tcg_debug_assert(src >= 32); + + /* Splat from integer reg allowed via constraints for v3.00. */ + if (src < 32) { + tcg_debug_assert(have_isa_3_00_vsx); + switch (vece) { + case MO_64: + tcg_out32(s, MTVSRDD | 1 | VRT(dst) | RA(src) | RB(src)); + return true; + case MO_32: + tcg_out32(s, MTVSRWS | 1 | VRT(dst) | RA(src)); + return true; + default: + /* Fail, so that we fall back on either dupm or mov+dup. */ + return false; + } + } /* * Recall we use (or emulate) VSX integer loads, so the integer is @@ -3045,7 +3094,11 @@ static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, out &= 31; switch (vece) { case MO_8: - tcg_out_mem_long(s, 0, LVEBX, out, base, offset); + if (have_isa_3_00_vsx) { + tcg_out_mem_long(s, LXV | 8, LVX, out, base, offset & -16); + } else { + tcg_out_mem_long(s, 0, LVEBX, out, base, offset); + } elt = extract32(offset, 0, 4); #ifndef HOST_WORDS_BIGENDIAN elt ^= 15; @@ -3054,7 +3107,11 @@ static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, break; case MO_16: assert((offset & 1) == 0); - tcg_out_mem_long(s, 0, LVEHX, out, base, offset); + if (have_isa_3_00_vsx) { + tcg_out_mem_long(s, LXV | 8, LVX, out, base, offset & -16); + } else { + tcg_out_mem_long(s, 0, LVEHX, out, base, offset); + } elt = extract32(offset, 1, 3); #ifndef HOST_WORDS_BIGENDIAN elt ^= 7; @@ -3062,6 +3119,10 @@ static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, tcg_out32(s, VSPLTH | VRT(out) | VRB(out) | (elt << 16)); break; case MO_32: + if (have_isa_3_00_vsx) { + tcg_out_mem_long(s, 0, LXVWSX | 1, out, base, offset); + break; + } assert((offset & 3) == 0); tcg_out_mem_long(s, 0, LVEWX, out, base, offset); elt = extract32(offset, 2, 2); @@ -3101,7 +3162,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, static const uint32_t add_op[4] = { VADDUBM, VADDUHM, VADDUWM, VADDUDM }, sub_op[4] = { VSUBUBM, VSUBUHM, VSUBUWM, VSUBUDM }, + neg_op[4] = { 0, 0, VNEGW, VNEGD }, eq_op[4] = { VCMPEQUB, VCMPEQUH, VCMPEQUW, VCMPEQUD }, + ne_op[4] = { VCMPNEB, VCMPNEH, VCMPNEW, 0 }, gts_op[4] = { VCMPGTSB, VCMPGTSH, VCMPGTSW, VCMPGTSD }, gtu_op[4] = { VCMPGTUB, VCMPGTUH, VCMPGTUW, VCMPGTUD }, ssadd_op[4] = { VADDSBS, VADDSHS, VADDSWS, 0 }, @@ -3143,6 +3206,11 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_sub_vec: insn = sub_op[vece]; break; + case INDEX_op_neg_vec: + insn = neg_op[vece]; + a2 = a1; + a1 = 0; + break; case INDEX_op_mul_vec: tcg_debug_assert(vece == MO_32 && have_isa_2_07_vsx); insn = VMULUWM; @@ -3205,6 +3273,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case TCG_COND_EQ: insn = eq_op[vece]; break; + case TCG_COND_NE: + insn = ne_op[vece]; + break; case TCG_COND_GT: insn = gts_op[vece]; break; @@ -3287,6 +3358,10 @@ static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, case TCG_COND_GTU: break; case TCG_COND_NE: + if (have_isa_3_00_vsx && vece <= MO_32) { + break; + } + /* fall through */ case TCG_COND_LE: case TCG_COND_LEU: need_inv = true; @@ -3442,6 +3517,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) static const TCGTargetOpDef sub2 = { .args_ct_str = { "r", "r", "rI", "rZM", "r", "r" } }; static const TCGTargetOpDef v_r = { .args_ct_str = { "v", "r" } }; + static const TCGTargetOpDef v_vr = { .args_ct_str = { "v", "vr" } }; static const TCGTargetOpDef v_v = { .args_ct_str = { "v", "v" } }; static const TCGTargetOpDef v_v_v = { .args_ct_str = { "v", "v", "v" } }; static const TCGTargetOpDef v_v_v_v @@ -3610,8 +3686,10 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_dup2_vec: return &v_v_v; case INDEX_op_not_vec: - case INDEX_op_dup_vec: + case INDEX_op_neg_vec: return &v_v; + case INDEX_op_dup_vec: + return have_isa_3_00_vsx ? &v_vr : &v_v; case INDEX_op_ld_vec: case INDEX_op_st_vec: case INDEX_op_dupm_vec: @@ -3647,6 +3725,9 @@ static void tcg_target_init(TCGContext *s) #ifdef PPC_FEATURE2_ARCH_3_00 if (hwcap2 & PPC_FEATURE2_ARCH_3_00) { have_isa_3_00 = true; + if (hwcap & PPC_FEATURE_HAS_VSX) { + have_isa_3_00_vsx = true; + } } #endif