From patchwork Tue Feb 11 00:39:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Simpson X-Patchwork-Id: 11374641 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 907EE92A for ; Tue, 11 Feb 2020 01:14:58 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F11A120708 for ; Tue, 11 Feb 2020 01:14:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=quicinc.com header.i=@quicinc.com header.b="WcoOax4q" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F11A120708 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=quicinc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:42014 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j1K8f-0003ZS-3P for patchwork-qemu-devel@patchwork.kernel.org; Mon, 10 Feb 2020 20:14:57 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:33188) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j1JcT-0001x7-Oo for qemu-devel@nongnu.org; Mon, 10 Feb 2020 19:41:52 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1j1JcI-0007Aa-Rs for qemu-devel@nongnu.org; Mon, 10 Feb 2020 19:41:41 -0500 Received: from alexa-out-sd-01.qualcomm.com ([199.106.114.38]:3651) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1j1JcF-0004uE-Gu for qemu-devel@nongnu.org; Mon, 10 Feb 2020 19:41:30 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1581381687; x=1612917687; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=tky/zdm+1G/nft1mb5tKRqeRo+5AyJEpH0yhf1J/HSw=; b=WcoOax4q06g8k/AKgbcH2NhytnN8DLd2z4atA8y4ty0U33kybdva6sva O/oG6btlbqYVvgko3QYSuKWKCclDB6yjV/zAPRT23bj1QVBna+UXBYra0 T8BU2bcx2w0bXNuevC+x9JbwyWFZM55u9U+6Zgxm1UUkYbhXRWWcv5OQl U=; Received: from unknown (HELO ironmsg05-sd.qualcomm.com) ([10.53.140.145]) by alexa-out-sd-01.qualcomm.com with ESMTP; 10 Feb 2020 16:41:00 -0800 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg05-sd.qualcomm.com with ESMTP; 10 Feb 2020 16:40:58 -0800 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id BA8E91B5F; Mon, 10 Feb 2020 18:40:58 -0600 (CST) From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [RFC PATCH 15/66] Hexagon arch import - instruction semantics definitions Date: Mon, 10 Feb 2020 18:39:53 -0600 Message-Id: <1581381644-13678-16-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1581381644-13678-1-git-send-email-tsimpson@quicinc.com> References: <1581381644-13678-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: FreeBSD 9.x [fuzzy] X-Received-From: 199.106.114.38 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: riku.voipio@iki.fi, richard.henderson@linaro.org, laurent@vivier.eu, Taylor Simpson , philmd@redhat.com, aleksandar.m.mail@gmail.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" Imported from the Hexagon architecture library imported/allidefs.def Top level instruction definition file imported/*.idef Instruction definition files These files are input to the first phase of the generator (gen_semantics.c) to create a python include file with the instruction semantics and attributes. The python include file is fed to the second phase (do_qemu.py). Signed-off-by: Taylor Simpson --- target/hexagon/imported/allidefs.def | 91 +++ target/hexagon/imported/alu.idef | 1335 +++++++++++++++++++++++++++++++++ target/hexagon/imported/branch.idef | 344 +++++++++ target/hexagon/imported/compare.idef | 639 ++++++++++++++++ target/hexagon/imported/float.idef | 498 ++++++++++++ target/hexagon/imported/ldst.idef | 421 +++++++++++ target/hexagon/imported/mpy.idef | 1269 +++++++++++++++++++++++++++++++ target/hexagon/imported/shift.idef | 1211 ++++++++++++++++++++++++++++++ target/hexagon/imported/subinsns.idef | 152 ++++ target/hexagon/imported/system.idef | 302 ++++++++ 10 files changed, 6262 insertions(+) create mode 100644 target/hexagon/imported/allidefs.def create mode 100644 target/hexagon/imported/alu.idef create mode 100644 target/hexagon/imported/branch.idef create mode 100644 target/hexagon/imported/compare.idef create mode 100644 target/hexagon/imported/float.idef create mode 100644 target/hexagon/imported/ldst.idef create mode 100644 target/hexagon/imported/mpy.idef create mode 100644 target/hexagon/imported/shift.idef create mode 100644 target/hexagon/imported/subinsns.idef create mode 100644 target/hexagon/imported/system.idef diff --git a/target/hexagon/imported/allidefs.def b/target/hexagon/imported/allidefs.def new file mode 100644 index 0000000..4864046 --- /dev/null +++ b/target/hexagon/imported/allidefs.def @@ -0,0 +1,91 @@ +/* + * Copyright (c) 2019 Qualcomm Innovation Center, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +/* + * Top level instruction definition file + */ + +#define DEF_MAPPING(TAG,FROMSYN,TOSYN) \ + Q6INSN(TAG,FROMSYN,ATTRIBS(A_MAPPING), \ + "Mapping from " FROMSYN " to " TOSYN, \ + fASM_MAP(FROMSYN,TOSYN)) + +#define DEF_COND_MAPPING(TAG,FROMSYN,COND,TOSYNA,TOSYNB) \ + Q6INSN(TAG,FROMSYN,ATTRIBS(A_MAPPING,A_CONDMAPPING), \ + "Conditional Mapping from " FROMSYN " to either " TOSYNA " or " TOSYNB, \ + fCOND_ASM_MAP(FROMSYN,COND,TOSYNA,TOSYNB)) + +#define DEF_V2_MAPPING(TAG,FROMSYN,TOSYN) \ + Q6INSN(TAG,FROMSYN,ATTRIBS(A_MAPPING,A_ARCHV2), \ + "Mapping from " FROMSYN " to " TOSYN, \ + fASM_MAP(FROMSYN,TOSYN)) + +#define DEF_V2_COND_MAPPING(TAG,FROMSYN,COND,TOSYNA,TOSYNB) \ + Q6INSN(TAG,FROMSYN,ATTRIBS(A_MAPPING,A_CONDMAPPING,A_ARCHV2), \ + "Conditional Mapping from " FROMSYN " to either " TOSYNA " or " TOSYNB, \ + fCOND_ASM_MAP(FROMSYN,COND,TOSYNA,TOSYNB)) + +#define DEF_V3_MAPPING(TAG,FROMSYN,TOSYN) \ + Q6INSN(TAG,FROMSYN,ATTRIBS(A_MAPPING,A_ARCHV3), \ + "Mapping from " FROMSYN " to " TOSYN, \ + fASM_MAP(FROMSYN,TOSYN)) + +#define DEF_V3_COND_MAPPING(TAG,FROMSYN,COND,TOSYNA,TOSYNB) \ + Q6INSN(TAG,FROMSYN,ATTRIBS(A_MAPPING,A_CONDMAPPING,A_ARCHV3), \ + "Conditional Mapping from " FROMSYN " to either " TOSYNA " or " TOSYNB, \ + fCOND_ASM_MAP(FROMSYN,COND,TOSYNA,TOSYNB)) + +#define DEF_V4_MAPPING(TAG,FROMSYN,TOSYN) \ + Q6INSN(TAG,FROMSYN,ATTRIBS(A_MAPPING,A_ARCHV4), \ + "Mapping from " FROMSYN " to " TOSYN, \ + fASM_MAP(FROMSYN,TOSYN)) + +#define DEF_V4_COND_MAPPING(TAG,FROMSYN,COND,TOSYNA,TOSYNB) \ + Q6INSN(TAG,FROMSYN,ATTRIBS(A_MAPPING,A_CONDMAPPING,A_ARCHV4), \ + "Conditional Mapping from " FROMSYN " to either " TOSYNA " or " TOSYNB, \ + fCOND_ASM_MAP(FROMSYN,COND,TOSYNA,TOSYNB)) + +#define DEF_V5_MAPPING(TAG,FROMSYN,TOSYN) \ + Q6INSN(TAG,FROMSYN,ATTRIBS(A_MAPPING,A_ARCHV5), \ + "Mapping from " FROMSYN " to " TOSYN, \ + fASM_MAP(FROMSYN,TOSYN)) + +#define DEF_V5_COND_MAPPING(TAG,FROMSYN,COND,TOSYNA,TOSYNB) \ + Q6INSN(TAG,FROMSYN,ATTRIBS(A_MAPPING,A_CONDMAPPING,A_ARCHV5), \ + "Conditional Mapping from " FROMSYN " to either " TOSYNA " or " TOSYNB, \ + fCOND_ASM_MAP(FROMSYN,COND,TOSYNA,TOSYNB)) + +#define DEF_VECX_MAPPING(TAG,FROMSYN,TOSYN) \ + EXTINSN(TAG,FROMSYN,ATTRIBS(A_MAPPING,A_VECX), \ + "VECX Mapping from " FROMSYN " to " TOSYN, \ + fASM_MAP(FROMSYN,TOSYN)) + +#define DEF_CVI_MAPPING(TAG,FROMSYN,TOSYN) \ + Q6INSN(TAG,FROMSYN,ATTRIBS(A_MAPPING,A_CVI), \ + "CVI Mapping from " FROMSYN " to " TOSYN, \ + fASM_MAP(FROMSYN,TOSYN)) + + +#include "branch.idef" +#include "ldst.idef" +#include "compare.idef" +#include "mpy.idef" +#include "alu.idef" +#include "float.idef" +#include "shift.idef" +#include "system.idef" +#include "subinsns.idef" diff --git a/target/hexagon/imported/alu.idef b/target/hexagon/imported/alu.idef new file mode 100644 index 0000000..df22a60 --- /dev/null +++ b/target/hexagon/imported/alu.idef @@ -0,0 +1,1335 @@ +/* + * Copyright (c) 2019 Qualcomm Innovation Center, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +/* + * ALU Instructions + */ + + +/**********************************************/ +/* Add/Sub instructions */ +/**********************************************/ + +Q6INSN(A2_add,"Rd32=add(Rs32,Rt32)",ATTRIBS(), +"Add 32-bit registers", +{ RdV=RsV+RtV;}) + +Q6INSN(A2_sub,"Rd32=sub(Rt32,Rs32)",ATTRIBS(), +"Subtract 32-bit registers", +{ RdV=RtV-RsV;}) + +#define COND_ALU(TAG,OPER,DESCR,SEMANTICS)\ +Q6INSN(TAG##t,"if (Pu4) "OPER,ATTRIBS(A_ARCHV2),DESCR,{if(fLSBOLD(PuV)){SEMANTICS;} else {CANCEL;}})\ +Q6INSN(TAG##f,"if (!Pu4) "OPER,ATTRIBS(A_ARCHV2),DESCR,{if(fLSBOLDNOT(PuV)){SEMANTICS;} else {CANCEL;}})\ +Q6INSN(TAG##tnew,"if (Pu4.new) " OPER,ATTRIBS(A_ARCHV2),DESCR,{if(fLSBNEW(PuN)){SEMANTICS;} else {CANCEL;}})\ +Q6INSN(TAG##fnew,"if (!Pu4.new) "OPER,ATTRIBS(A_ARCHV2),DESCR,{if(fLSBNEWNOT(PuN)){SEMANTICS;} else {CANCEL;}}) + +COND_ALU(A2_padd,"Rd32=add(Rs32,Rt32)","Conditionally Add 32-bit registers",RdV=RsV+RtV) +COND_ALU(A2_psub,"Rd32=sub(Rt32,Rs32)","Conditionally Subtract 32-bit registers",RdV=RtV-RsV) +COND_ALU(A2_paddi,"Rd32=add(Rs32,#s8)","Conditionally Add Register and immediate",fIMMEXT(siV); RdV=RsV+siV) +COND_ALU(A2_pxor,"Rd32=xor(Rs32,Rt32)","Conditionally XOR registers",RdV=RsV^RtV); +COND_ALU(A2_pand,"Rd32=and(Rs32,Rt32)","Conditionally AND registers",RdV=RsV&RtV); +COND_ALU(A2_por,"Rd32=or(Rs32,Rt32)","Conditionally OR registers",RdV=RsV|RtV); + +COND_ALU(A4_psxtb,"Rd32=sxtb(Rs32)","Conditionally sign-extend byte", RdV=fSXTN(8,32,RsV)); +COND_ALU(A4_pzxtb,"Rd32=zxtb(Rs32)","Conditionally zero-extend byte", RdV=fZXTN(8,32,RsV)); +COND_ALU(A4_psxth,"Rd32=sxth(Rs32)","Conditionally sign-extend halfword", RdV=fSXTN(16,32,RsV)); +COND_ALU(A4_pzxth,"Rd32=zxth(Rs32)","Conditionally zero-extend halfword", RdV=fZXTN(16,32,RsV)); +COND_ALU(A4_paslh,"Rd32=aslh(Rs32)","Conditionally zero-extend halfword", RdV=RsV<<16); +COND_ALU(A4_pasrh,"Rd32=asrh(Rs32)","Conditionally zero-extend halfword", RdV=RsV>>16); + + +DEF_V2_MAPPING(A2_tfrt, "if (Pu4) Rd32=Rs32","if (Pu4) Rd32=add(Rs32,#0)") +DEF_V2_MAPPING(A2_tfrf, "if (!Pu4) Rd32=Rs32","if (!Pu4) Rd32=add(Rs32,#0)") +DEF_V2_MAPPING(A2_tfrtnew,"if (Pu4.new) Rd32=Rs32","if (Pu4.new) Rd32=add(Rs32,#0)") +DEF_V2_MAPPING(A2_tfrfnew,"if (!Pu4.new) Rd32=Rs32","if (!Pu4.new) Rd32=add(Rs32,#0)") + + +DEF_V2_MAPPING(A2_tfrpt, "if (Pu4) Rdd32=Rss32","if (Pu4) Rdd32=combine(Rss.H32,Rss.L32)") +DEF_V2_MAPPING(A2_tfrpf, "if (!Pu4) Rdd32=Rss32","if (!Pu4) Rdd32=combine(Rss.H32,Rss.L32)") +DEF_V2_MAPPING(A2_tfrptnew,"if (Pu4.new) Rdd32=Rss32","if (Pu4.new) Rdd32=combine(Rss.H32,Rss.L32)") +DEF_V2_MAPPING(A2_tfrpfnew,"if (!Pu4.new) Rdd32=Rss32","if (!Pu4.new) Rdd32=combine(Rss.H32,Rss.L32)") + + +Q6INSN(A2_addsat,"Rd32=add(Rs32,Rt32):sat",ATTRIBS(), +"Add 32-bit registers with saturation", +{ RdV=fSAT(fSE32_64(RsV)+fSE32_64(RtV)); }) + +Q6INSN(A2_subsat,"Rd32=sub(Rt32,Rs32):sat",ATTRIBS(), +"Subtract 32-bit registers with saturation", +{ RdV=fSAT(fSE32_64(RtV) - fSE32_64(RsV)); }) + + +Q6INSN(A2_addi,"Rd32=add(Rs32,#s16)",ATTRIBS(), +"Add a signed immediate to a register", +{ fIMMEXT(siV); RdV=RsV+siV;}) + + +Q6INSN(C4_addipc,"Rd32=add(pc,#u6)",ATTRIBS(), +"Add immediate to PC", +{ RdV=fREAD_PC()+fIMMEXT(uiV);}) + + + +/**********************************************/ +/* Single-precision HL forms */ +/* These insns and the SP mpy are the ones */ +/* that can do .HL stuff */ +/**********************************************/ +#define STD_HL_INSN(TAG,OPER,AOPER,ATR,SEM)\ +Q6INSN(A2_##TAG##_ll, OPER"(Rt.L32,Rs.L32)"AOPER, ATR,"",{SEM(fGETHALF(0,RtV),fGETHALF(0,RsV));})\ +Q6INSN(A2_##TAG##_lh, OPER"(Rt.L32,Rs.H32)"AOPER, ATR,"",{SEM(fGETHALF(0,RtV),fGETHALF(1,RsV));})\ +Q6INSN(A2_##TAG##_hl, OPER"(Rt.H32,Rs.L32)"AOPER, ATR,"",{SEM(fGETHALF(1,RtV),fGETHALF(0,RsV));})\ +Q6INSN(A2_##TAG##_hh, OPER"(Rt.H32,Rs.H32)"AOPER, ATR,"",{SEM(fGETHALF(1,RtV),fGETHALF(1,RsV));}) + +#define SUBSTD_HL_INSN(TAG,OPER,AOPER,ATR,SEM)\ +Q6INSN(A2_##TAG##_ll, OPER"(Rt.L32,Rs.L32)"AOPER, ATR,"",{SEM(fGETHALF(0,RtV),fGETHALF(0,RsV));})\ +Q6INSN(A2_##TAG##_hl, OPER"(Rt.L32,Rs.H32)"AOPER, ATR,"",{SEM(fGETHALF(0,RtV),fGETHALF(1,RsV));}) + + +#undef HLSEM +#define HLSEM(A,B) RdV=fSXTN(16,32,(A+B)) +SUBSTD_HL_INSN(addh_l16,"Rd32=add","",ATTRIBS(),HLSEM) + +#undef HLSEM +#define HLSEM(A,B) RdV=fSATH(A+B) +SUBSTD_HL_INSN(addh_l16_sat,"Rd32=add",":sat",ATTRIBS(),HLSEM) + +#undef HLSEM +#define HLSEM(A,B) RdV=fSXTN(16,32,(A-B)) +SUBSTD_HL_INSN(subh_l16,"Rd32=sub","",ATTRIBS(),HLSEM) + +#undef HLSEM +#define HLSEM(A,B) RdV=fSATH(A-B) +SUBSTD_HL_INSN(subh_l16_sat,"Rd32=sub",":sat",ATTRIBS(),HLSEM) + +#undef HLSEM +#define HLSEM(A,B) RdV=(A+B)<<16 +STD_HL_INSN(addh_h16,"Rd32=add",":<<16",ATTRIBS(),HLSEM) + +#undef HLSEM +#define HLSEM(A,B) RdV=(fSATH(A+B))<<16 +STD_HL_INSN(addh_h16_sat,"Rd32=add",":sat:<<16",ATTRIBS(),HLSEM) + +#undef HLSEM +#define HLSEM(A,B) RdV=(A-B)<<16 +STD_HL_INSN(subh_h16,"Rd32=sub",":<<16",ATTRIBS(),HLSEM) + +#undef HLSEM +#define HLSEM(A,B) RdV=(fSATH(A-B))<<16 +STD_HL_INSN(subh_h16_sat,"Rd32=sub",":sat:<<16",ATTRIBS(),HLSEM) + + + + +Q6INSN(A2_aslh,"Rd32=aslh(Rs32)",ATTRIBS(), +"Arithmetic Shift Left by Halfword",{ RdV=RsV<<16; }) + +Q6INSN(A2_asrh,"Rd32=asrh(Rs32)",ATTRIBS(), +"Arithmetic Shift Right by Halfword",{ RdV=RsV>>16; }) + + +/* 64-bit versions */ + +Q6INSN(A2_addp,"Rdd32=add(Rss32,Rtt32)",ATTRIBS(), +"Add", +{ RddV=RssV+RttV;}) + +Q6INSN(A2_addpsat,"Rdd32=add(Rss32,Rtt32):sat",ATTRIBS(A_ARCHV3), +"Add", +{ fADDSAT64(RddV,RssV,RttV);}) + +Q6INSN(A2_addspl,"Rdd32=add(Rss32,Rtt32):raw:lo",ATTRIBS(A_ARCHV3), +"Add", +{ RddV=RttV+fSXTN(32,64,fGETWORD(0,RssV));}) + +Q6INSN(A2_addsph,"Rdd32=add(Rss32,Rtt32):raw:hi",ATTRIBS(A_ARCHV3), +"Add", +{ RddV=RttV+fSXTN(32,64,fGETWORD(1,RssV));}) + +DEF_V3_COND_MAPPING(A2_addsp,"Rdd32=add(Rs32,Rtt32)","Rs32 & 1","Rdd32=add(Rss32,Rtt32):raw:hi","Rdd32=add(Rss32,Rtt32):raw:lo") + +Q6INSN(A2_subp,"Rdd32=sub(Rtt32,Rss32)",ATTRIBS(), +"Sub", +{ RddV=RttV-RssV;}) + +/* 64-bit with carry */ + +Q6INSN(A4_addp_c,"Rdd32=add(Rss32,Rtt32,Px4):carry",ATTRIBS(A_RESTRICT_LATEPRED,A_NOTE_LATEPRED),"Add with Carry", +{ + fPREDUSE_TIMING(); + RddV = RssV + RttV + fLSBOLD(PxV); + PxV = f8BITSOF(fCARRY_FROM_ADD(RssV,RttV,fLSBOLD(PxV))); + fHIDE(MARK_LATE_PRED_WRITE(PxN)) +}) + +Q6INSN(A4_subp_c,"Rdd32=sub(Rss32,Rtt32,Px4):carry",ATTRIBS(A_RESTRICT_LATEPRED,A_NOTE_LATEPRED),"Sub with Carry", +{ + fPREDUSE_TIMING(); + RddV = RssV + ~RttV + fLSBOLD(PxV); + PxV = f8BITSOF(fCARRY_FROM_ADD(RssV,~RttV,fLSBOLD(PxV))); + fHIDE(MARK_LATE_PRED_WRITE(PxN)) +}) + + +/* NEG and ABS */ + +DEF_MAPPING(A2_neg,"Rd32=neg(Rs32)","Rd32=sub(#0,Rs32)") + + +Q6INSN(A2_negsat,"Rd32=neg(Rs32):sat",ATTRIBS(), +"Arithmetic negate register", { RdV = fSAT(-fCAST8s(RsV)); }) + +Q6INSN(A2_abs,"Rd32=abs(Rs32)",ATTRIBS(), +"Absolute Value register", { RdV = fABS(RsV); }) + +Q6INSN(A2_abssat,"Rd32=abs(Rs32):sat",ATTRIBS(), +"Arithmetic negate register", { RdV = fSAT(fABS(fCAST4_8s(RsV))); }) + +Q6INSN(A2_vconj,"Rdd32=vconj(Rss32):sat",ATTRIBS(A_ARCHV2), +"Vector Complex conjugate of Rss", +{ fSETHALF(1,RddV,fSATN(16,-fGETHALF(1,RssV))); + fSETHALF(0,RddV,fGETHALF(0,RssV)); + fSETHALF(3,RddV,fSATN(16,-fGETHALF(3,RssV))); + fSETHALF(2,RddV,fGETHALF(2,RssV)); +}) + + +/* 64-bit versions */ + +Q6INSN(A2_negp,"Rdd32=neg(Rss32)",ATTRIBS(), +"Arithmetic negate register", { RddV = -RssV; }) + +Q6INSN(A2_absp,"Rdd32=abs(Rss32)",ATTRIBS(), +"Absolute Value register", { RddV = fABS(RssV); }) + + +/* MIN and MAX R */ + +Q6INSN(A2_max,"Rd32=max(Rs32,Rt32)",ATTRIBS(), +"Maximum of two registers", +{ RdV = fMAX(RsV,RtV); }) + +Q6INSN(A2_maxu,"Rd32=maxu(Rs32,Rt32)",ATTRIBS(A_INTRINSIC_RETURNS_UNSIGNED), +"Maximum of two registers (unsigned)", +{ RdV = fMAX(fCAST4u(RsV),fCAST4u(RtV)); }) + +Q6INSN(A2_min,"Rd32=min(Rt32,Rs32)",ATTRIBS(), +"Minimum of two registers", +{ RdV = fMIN(RtV,RsV); }) + +Q6INSN(A2_minu,"Rd32=minu(Rt32,Rs32)",ATTRIBS(A_INTRINSIC_RETURNS_UNSIGNED), +"Minimum of two registers (unsigned)", +{ RdV = fMIN(fCAST4u(RtV),fCAST4u(RsV)); }) + +/* MIN and MAX Pairs */ +#if 1 +Q6INSN(A2_maxp,"Rdd32=max(Rss32,Rtt32)",ATTRIBS(A_ARCHV3), +"Maximum of two register pairs", +{ RddV = fMAX(RssV,RttV); }) + +Q6INSN(A2_maxup,"Rdd32=maxu(Rss32,Rtt32)",ATTRIBS(A_INTRINSIC_RETURNS_UNSIGNED,A_ARCHV3), +"Maximum of two register pairs (unsigned)", +{ RddV = fMAX(fCAST8u(RssV),fCAST8u(RttV)); }) + +Q6INSN(A2_minp,"Rdd32=min(Rtt32,Rss32)",ATTRIBS(A_ARCHV3), +"Minimum of two register pairs", +{ RddV = fMIN(RttV,RssV); }) + +Q6INSN(A2_minup,"Rdd32=minu(Rtt32,Rss32)",ATTRIBS(A_INTRINSIC_RETURNS_UNSIGNED,A_ARCHV3), +"Minimum of two register pairs (unsigned)", +{ RddV = fMIN(fCAST8u(RttV),fCAST8u(RssV)); }) +#endif + +/**********************************************/ +/* Register and Immediate Transfers */ +/**********************************************/ + +Q6INSN(A2_nop,"nop",ATTRIBS(A_IT_NOP), +"Nop (32-bit encoding)", + fHIDE( { fNOP_EXECUTED } )) + + +Q6INSN(A4_ext,"immext(#u26:6)",ATTRIBS(A_IT_EXTENDER), +"This instruction carries the 26 most-significant immediate bits for the next instruction", +{ fHIDE(); }) + + +Q6INSN(A2_tfr,"Rd32=Rs32",ATTRIBS(), +"tfr register",{ RdV=RsV;}) + +Q6INSN(A2_tfrsi,"Rd32=#s16",ATTRIBS(), +"transfer signed immediate to register",{ fIMMEXT(siV); RdV=siV;}) + +DEF_MAPPING(A2_tfrp,"Rdd32=Rss32","Rdd32=combine(Rss.H32,Rss.L32)") +DEF_V2_COND_MAPPING(A2_tfrpi,"Rdd32=#s8","#s8<0","Rdd32=combine(#-1,#s8)","Rdd32=combine(#0,#s8)") +DEF_MAPPING(A2_zxtb,"Rd32=zxtb(Rs32)","Rd32=and(Rs32,#255)") + +Q6INSN(A2_sxtb,"Rd32=sxtb(Rs32)",ATTRIBS(), +"Sign extend byte", {RdV = fSXTN(8,32,RsV);}) + +Q6INSN(A2_zxth,"Rd32=zxth(Rs32)",ATTRIBS(), +"Zero extend half", {RdV = fZXTN(16,32,RsV);}) + +Q6INSN(A2_sxth,"Rd32=sxth(Rs32)",ATTRIBS(), +"Sign extend half", {RdV = fSXTN(16,32,RsV);}) + +Q6INSN(A2_combinew,"Rdd32=combine(Rs32,Rt32)",ATTRIBS(A_ROPS_2), +"Combine two words into a register pair", +{ fSETWORD(0,RddV,RtV); + fSETWORD(1,RddV,RsV); +}) + +Q6INSN(A4_combineri,"Rdd32=combine(Rs32,#s8)",ATTRIBS(A_ROPS_2), +"Combine a word and an immediate into a register pair", +{ fIMMEXT(siV); fSETWORD(0,RddV,siV); + fSETWORD(1,RddV,RsV); +}) + +Q6INSN(A4_combineir,"Rdd32=combine(#s8,Rs32)",ATTRIBS(A_ROPS_2), +"Combine a word and an immediate into a register pair", +{ fIMMEXT(siV); fSETWORD(0,RddV,RsV); + fSETWORD(1,RddV,siV); +}) + + + +Q6INSN(A2_combineii,"Rdd32=combine(#s8,#S8)",ATTRIBS(A_ARCHV2,A_ROPS_2), +"Set two small immediates", +{ fIMMEXT(siV); fSETWORD(0,RddV,SiV); fSETWORD(1,RddV,siV); }) + +Q6INSN(A4_combineii,"Rdd32=combine(#s8,#U6)",ATTRIBS(A_ROPS_2),"Set two small immediates", +{ fIMMEXT(UiV); fSETWORD(0,RddV,UiV); fSETWORD(1,RddV,siV); }) + + +Q6INSN(A2_combine_hh,"Rd32=combine(Rt.H32,Rs.H32)",ATTRIBS(), +"Combine two halfs into a register", {RdV = (fGETUHALF(1,RtV)<<16) | fGETUHALF(1,RsV);}) + +Q6INSN(A2_combine_hl,"Rd32=combine(Rt.H32,Rs.L32)",ATTRIBS(), +"Combine two halfs into a register", {RdV = (fGETUHALF(1,RtV)<<16) | fGETUHALF(0,RsV);}) + +Q6INSN(A2_combine_lh,"Rd32=combine(Rt.L32,Rs.H32)",ATTRIBS(), +"Combine two halfs into a register", {RdV = (fGETUHALF(0,RtV)<<16) | fGETUHALF(1,RsV);}) + +Q6INSN(A2_combine_ll,"Rd32=combine(Rt.L32,Rs.L32)",ATTRIBS(), +"Combine two halfs into a register", {RdV = (fGETUHALF(0,RtV)<<16) | fGETUHALF(0,RsV);}) + +Q6INSN(A2_tfril,"Rx.L32=#u16",ATTRIBS(), +"Set low 16-bits, leave upper 16 unchanged",{ fSETHALF(0,RxV,uiV);}) + +Q6INSN(A2_tfrih,"Rx.H32=#u16",ATTRIBS(), +"Set high 16-bits, leave low 16 unchanged",{ fSETHALF(1,RxV,uiV);}) + +Q6INSN(A2_tfrcrr,"Rd32=Cs32",ATTRIBS(), +"transfer control register to general register",{ RdV=CsV;}) + +Q6INSN(A2_tfrrcr,"Cd32=Rs32",ATTRIBS(), +"transfer general register to control register",{ CdV=RsV;}) + +Q6INSN(A4_tfrcpp,"Rdd32=Css32",ATTRIBS(), +"transfer control register to general register",{ RddV=CssV;}) + +Q6INSN(A4_tfrpcp,"Cdd32=Rss32",ATTRIBS(), +"transfer general register to control register",{ CddV=RssV;}) + + +/**********************************************/ +/* Logicals */ +/**********************************************/ + +Q6INSN(A2_and,"Rd32=and(Rs32,Rt32)",ATTRIBS(), +"logical AND",{ RdV=RsV&RtV;}) + +Q6INSN(A2_or,"Rd32=or(Rs32,Rt32)",ATTRIBS(), +"logical OR",{ RdV=RsV|RtV;}) + +Q6INSN(A2_xor,"Rd32=xor(Rs32,Rt32)",ATTRIBS(), +"logical XOR",{ RdV=RsV^RtV;}) + +DEF_MAPPING(A2_not,"Rd32=not(Rs32)","Rd32=sub(#-1,Rs32)") + +Q6INSN(M2_xor_xacc,"Rx32^=xor(Rs32,Rt32)",ATTRIBS(A_ARCHV2), +"logical XOR with XOR accumulation",{ RxV^=RsV^RtV;}) + +Q6INSN(M4_xor_xacc,"Rxx32^=xor(Rss32,Rtt32)",, +"logical XOR with XOR accumulation",{ RxxV^=RssV^RttV;}) + + + +Q6INSN(A4_andn,"Rd32=and(Rt32,~Rs32)",, +"And-Not", { RdV = (RtV & ~RsV); }) + +Q6INSN(A4_orn,"Rd32=or(Rt32,~Rs32)",, +"Or-Not", { RdV = (RtV | ~RsV); }) + + +Q6INSN(A4_andnp,"Rdd32=and(Rtt32,~Rss32)",, +"And-Not", { RddV = (RttV & ~RssV); }) + +Q6INSN(A4_ornp,"Rdd32=or(Rtt32,~Rss32)",, +"Or-Not", { RddV = (RttV | ~RssV); }) + + + + +/********************/ +/* Compound add-add */ +/********************/ + +Q6INSN(S4_addaddi,"Rd32=add(Rs32,add(Ru32,#s6))",ATTRIBS(A_ROPS_2), + "3-input add", + { RdV = RsV + RuV + fIMMEXT(siV); }) + + +Q6INSN(S4_subaddi,"Rd32=add(Rs32,sub(#s6,Ru32))",ATTRIBS(A_ROPS_2), + "3-input sub", + { RdV = RsV - RuV + fIMMEXT(siV); }) + + + +/****************************/ +/* Compound logical-logical */ +/****************************/ + +Q6INSN(M4_and_and,"Rx32&=and(Rs32,Rt32)",ATTRIBS(A_ROPS_2), +"Compound And-And", { RxV &= (RsV & RtV); }) + +Q6INSN(M4_and_andn,"Rx32&=and(Rs32,~Rt32)",ATTRIBS(A_ROPS_2), +"Compound And-Andn", { RxV &= (RsV & ~RtV); }) + +Q6INSN(M4_and_or,"Rx32&=or(Rs32,Rt32)",ATTRIBS(A_ROPS_2), +"Compound And-Or", { RxV &= (RsV | RtV); }) + +Q6INSN(M4_and_xor,"Rx32&=xor(Rs32,Rt32)",ATTRIBS(A_ROPS_2), +"Compound And-xor", { RxV &= (RsV ^ RtV); }) + + + +Q6INSN(M4_or_and,"Rx32|=and(Rs32,Rt32)",ATTRIBS(A_ROPS_2), +"Compound Or-And", { RxV |= (RsV & RtV); }) + +Q6INSN(M4_or_andn,"Rx32|=and(Rs32,~Rt32)",ATTRIBS(A_ROPS_2), +"Compound Or-AndN", { RxV |= (RsV & ~RtV); }) + +Q6INSN(M4_or_or,"Rx32|=or(Rs32,Rt32)",ATTRIBS(A_ROPS_2), +"Compound Or-Or", { RxV |= (RsV | RtV); }) + +Q6INSN(M4_or_xor,"Rx32|=xor(Rs32,Rt32)",ATTRIBS(A_ROPS_2), +"Compound Or-xor", { RxV |= (RsV ^ RtV); }) + + +Q6INSN(S4_or_andix,"Rx32=or(Ru32,and(Rx32,#s10))",ATTRIBS(A_ROPS_2), +"Compound Or-And", { RxV = RuV | (RxV & fIMMEXT(siV)); }) + +Q6INSN(S4_or_andi,"Rx32|=and(Rs32,#s10)",ATTRIBS(A_ROPS_2), +"Compound Or-And", { RxV = RxV | (RsV & fIMMEXT(siV)); }) + +Q6INSN(S4_or_ori,"Rx32|=or(Rs32,#s10)",ATTRIBS(A_ROPS_2), +"Compound Or-And", { RxV = RxV | (RsV | fIMMEXT(siV)); }) + + + + +Q6INSN(M4_xor_and,"Rx32^=and(Rs32,Rt32)",ATTRIBS(A_ROPS_2), +"Compound Xor-And", { RxV ^= (RsV & RtV); }) + +Q6INSN(M4_xor_or,"Rx32^=or(Rs32,Rt32)",ATTRIBS(A_ROPS_2), +"Compound Xor-Or", { RxV ^= (RsV | RtV); }) + +Q6INSN(M4_xor_andn,"Rx32^=and(Rs32,~Rt32)",ATTRIBS(A_ROPS_2), +"Compound Xor-And", { RxV ^= (RsV & ~RtV); }) + + + + + + +Q6INSN(A2_subri,"Rd32=sub(#s10,Rs32)",ATTRIBS(A_ARCHV2), +"Subtract register from immediate",{ fIMMEXT(siV); RdV=siV-RsV;}) + +Q6INSN(A2_andir,"Rd32=and(Rs32,#s10)",ATTRIBS(A_ARCHV2), +"logical AND with immediate",{ fIMMEXT(siV); RdV=RsV&siV;}) + +Q6INSN(A2_orir,"Rd32=or(Rs32,#s10)",ATTRIBS(A_ARCHV2), +"logical OR with immediate",{ fIMMEXT(siV); RdV=RsV|siV;}) + + + + +Q6INSN(A2_andp,"Rdd32=and(Rss32,Rtt32)",ATTRIBS(), +"logical AND pair",{ RddV=RssV&RttV;}) + +Q6INSN(A2_orp,"Rdd32=or(Rss32,Rtt32)",ATTRIBS(), +"logical OR pair",{ RddV=RssV|RttV;}) + +Q6INSN(A2_xorp,"Rdd32=xor(Rss32,Rtt32)",ATTRIBS(), +"logical eXclusive OR pair",{ RddV=RssV^RttV;}) + +Q6INSN(A2_notp,"Rdd32=not(Rss32)",ATTRIBS(), +"logical NOT pair",{ RddV=~RssV;}) + +Q6INSN(A2_sxtw,"Rdd32=sxtw(Rs32)",ATTRIBS(), +"Sign extend 32-bit word to 64-bit pair", +{ RddV = fCAST4_8s(RsV); }) + +Q6INSN(A2_sat,"Rd32=sat(Rss32)",ATTRIBS(), +"Saturate to 32-bit Signed", +{ RdV = fSAT(RssV); }) + +Q6INSN(A2_roundsat,"Rd32=round(Rss32):sat",ATTRIBS(), +"Round & Saturate to 32-bit Signed", +{ fHIDE(size8s_t tmp;) fADDSAT64(tmp,RssV,0x080000000ULL); RdV = fGETWORD(1,tmp); }) + +Q6INSN(A2_sath,"Rd32=sath(Rs32)",ATTRIBS(), +"Saturate to 16-bit Signed", +{ RdV = fSATH(RsV); }) + +Q6INSN(A2_satuh,"Rd32=satuh(Rs32)",ATTRIBS(), +"Saturate to 16-bit Unsigned", +{ RdV = fSATUH(RsV); }) + +Q6INSN(A2_satub,"Rd32=satub(Rs32)",ATTRIBS(), +"Saturate to 8-bit Unsigned", +{ RdV = fSATUB(RsV); }) + +Q6INSN(A2_satb,"Rd32=satb(Rs32)",ATTRIBS(A_ARCHV2), +"Saturate to 8-bit Signed", +{ RdV = fSATB(RsV); }) + +/**********************************************/ +/* Vector Add */ +/**********************************************/ + +Q6INSN(A2_vaddub,"Rdd32=vaddub(Rss32,Rtt32)",ATTRIBS(), +"Add vector of bytes", +{ + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBYTE(i,RddV,(fGETUBYTE(i,RssV)+fGETUBYTE(i,RttV))); + } +}) +DEF_MAPPING(A2_vaddb_map,"Rdd32=vaddb(Rss32,Rtt32)","Rdd32=vaddub(Rss32,Rtt32)") + +Q6INSN(A2_vaddubs,"Rdd32=vaddub(Rss32,Rtt32):sat",ATTRIBS(), +"Add vector of bytes", +{ + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBYTE(i,RddV,fSATUN(8,fGETUBYTE(i,RssV)+fGETUBYTE(i,RttV))); + } +}) + +Q6INSN(A2_vaddh,"Rdd32=vaddh(Rss32,Rtt32)",ATTRIBS(), +"Add vector of half integers", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,fGETHALF(i,RssV)+fGETHALF(i,RttV)); + } +}) + +Q6INSN(A2_vaddhs,"Rdd32=vaddh(Rss32,Rtt32):sat",ATTRIBS(), +"Add vector of half integers with saturation", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,fSATN(16,fGETHALF(i,RssV)+fGETHALF(i,RttV))); + } +}) + +Q6INSN(A2_vadduhs,"Rdd32=vadduh(Rss32,Rtt32):sat",ATTRIBS(), +"Add vector of unsigned half integers with saturation", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,fSATUN(16,fGETUHALF(i,RssV)+fGETUHALF(i,RttV))); + } +}) + +Q6INSN(A5_vaddhubs,"Rd32=vaddhub(Rss32,Rtt32):sat",ATTRIBS(), +"Add vector of half integers with saturation and pack to unsigned bytes", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETBYTE(i,RdV,fSATUB(fGETHALF(i,RssV)+fGETHALF(i,RttV))); + } +}) + +Q6INSN(A2_vaddw,"Rdd32=vaddw(Rss32,Rtt32)",ATTRIBS(), +"Add vector of words", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,fGETWORD(i,RssV)+fGETWORD(i,RttV)); + } +}) + +Q6INSN(A2_vaddws,"Rdd32=vaddw(Rss32,Rtt32):sat",ATTRIBS(), +"Add vector of words with saturation", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,fSATN(32,fGETWORD(i,RssV)+fGETWORD(i,RttV))); + } +}) + + + +Q6INSN(S4_vxaddsubw,"Rdd32=vxaddsubw(Rss32,Rtt32):sat",ATTRIBS(), +"Cross vector add-sub words with saturation", +{ + fSETWORD(0,RddV,fSAT(fGETWORD(0,RssV)+fGETWORD(1,RttV))); + fSETWORD(1,RddV,fSAT(fGETWORD(1,RssV)-fGETWORD(0,RttV))); +}) +Q6INSN(S4_vxsubaddw,"Rdd32=vxsubaddw(Rss32,Rtt32):sat",ATTRIBS(), +"Cross vector sub-add words with saturation", +{ + fSETWORD(0,RddV,fSAT(fGETWORD(0,RssV)-fGETWORD(1,RttV))); + fSETWORD(1,RddV,fSAT(fGETWORD(1,RssV)+fGETWORD(0,RttV))); +}) + + + +Q6INSN(S4_vxaddsubh,"Rdd32=vxaddsubh(Rss32,Rtt32):sat",ATTRIBS(), +"Cross vector add-sub halfwords with saturation", +{ + fSETHALF(0,RddV,fSATH(fGETHALF(0,RssV)+fGETHALF(1,RttV))); + fSETHALF(1,RddV,fSATH(fGETHALF(1,RssV)-fGETHALF(0,RttV))); + + fSETHALF(2,RddV,fSATH(fGETHALF(2,RssV)+fGETHALF(3,RttV))); + fSETHALF(3,RddV,fSATH(fGETHALF(3,RssV)-fGETHALF(2,RttV))); + +}) +Q6INSN(S4_vxsubaddh,"Rdd32=vxsubaddh(Rss32,Rtt32):sat",ATTRIBS(), +"Cross vector sub-add halfwords with saturation", +{ + fSETHALF(0,RddV,fSATH(fGETHALF(0,RssV)-fGETHALF(1,RttV))); + fSETHALF(1,RddV,fSATH(fGETHALF(1,RssV)+fGETHALF(0,RttV))); + + fSETHALF(2,RddV,fSATH(fGETHALF(2,RssV)-fGETHALF(3,RttV))); + fSETHALF(3,RddV,fSATH(fGETHALF(3,RssV)+fGETHALF(2,RttV))); +}) + + + + +Q6INSN(S4_vxaddsubhr,"Rdd32=vxaddsubh(Rss32,Rtt32):rnd:>>1:sat",ATTRIBS(), +"Cross vector add-sub halfwords with shift, round, and saturation", +{ + fSETHALF(0,RddV,fSATH((fGETHALF(0,RssV)+fGETHALF(1,RttV)+1)>>1)); + fSETHALF(1,RddV,fSATH((fGETHALF(1,RssV)-fGETHALF(0,RttV)+1)>>1)); + + fSETHALF(2,RddV,fSATH((fGETHALF(2,RssV)+fGETHALF(3,RttV)+1)>>1)); + fSETHALF(3,RddV,fSATH((fGETHALF(3,RssV)-fGETHALF(2,RttV)+1)>>1)); + +}) +Q6INSN(S4_vxsubaddhr,"Rdd32=vxsubaddh(Rss32,Rtt32):rnd:>>1:sat",ATTRIBS(), +"Cross vector sub-add halfwords with shift, round, and saturation", +{ + fSETHALF(0,RddV,fSATH((fGETHALF(0,RssV)-fGETHALF(1,RttV)+1)>>1)); + fSETHALF(1,RddV,fSATH((fGETHALF(1,RssV)+fGETHALF(0,RttV)+1)>>1)); + + fSETHALF(2,RddV,fSATH((fGETHALF(2,RssV)-fGETHALF(3,RttV)+1)>>1)); + fSETHALF(3,RddV,fSATH((fGETHALF(3,RssV)+fGETHALF(2,RttV)+1)>>1)); +}) + + + + + +/**********************************************/ +/* 1/2 Vector operations */ +/**********************************************/ + + +Q6INSN(A2_svavgh,"Rd32=vavgh(Rs32,Rt32)",ATTRIBS(A_ARCHV2), +"Avg vector of half integers", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETHALF(i,RdV,((fGETHALF(i,RsV)+fGETHALF(i,RtV))>>1)); + } +}) + +Q6INSN(A2_svavghs,"Rd32=vavgh(Rs32,Rt32):rnd",ATTRIBS(A_ARCHV2), +"Avg vector of half integers with rounding", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETHALF(i,RdV,((fGETHALF(i,RsV)+fGETHALF(i,RtV)+1)>>1)); + } +}) + + + +Q6INSN(A2_svnavgh,"Rd32=vnavgh(Rt32,Rs32)",ATTRIBS(A_ARCHV2), +"Avg vector of half integers", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETHALF(i,RdV,((fGETHALF(i,RtV)-fGETHALF(i,RsV))>>1)); + } +}) + + +Q6INSN(A2_svaddh,"Rd32=vaddh(Rs32,Rt32)",ATTRIBS(), +"Add vector of half integers", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETHALF(i,RdV,fGETHALF(i,RsV)+fGETHALF(i,RtV)); + } +}) + +Q6INSN(A2_svaddhs,"Rd32=vaddh(Rs32,Rt32):sat",ATTRIBS(), +"Add vector of half integers with saturation", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETHALF(i,RdV,fSATN(16,fGETHALF(i,RsV)+fGETHALF(i,RtV))); + } +}) + +Q6INSN(A2_svadduhs,"Rd32=vadduh(Rs32,Rt32):sat",ATTRIBS(), +"Add vector of unsigned half integers with saturation", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETHALF(i,RdV,fSATUN(16,fGETUHALF(i,RsV)+fGETUHALF(i,RtV))); + } +}) + + +Q6INSN(A2_svsubh,"Rd32=vsubh(Rt32,Rs32)",ATTRIBS(), +"Sub vector of half integers", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETHALF(i,RdV,fGETHALF(i,RtV)-fGETHALF(i,RsV)); + } +}) + +Q6INSN(A2_svsubhs,"Rd32=vsubh(Rt32,Rs32):sat",ATTRIBS(), +"Sub vector of half integers with saturation", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETHALF(i,RdV,fSATN(16,fGETHALF(i,RtV)-fGETHALF(i,RsV))); + } +}) + +Q6INSN(A2_svsubuhs,"Rd32=vsubuh(Rt32,Rs32):sat",ATTRIBS(), +"Sub vector of unsigned half integers with saturation", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETHALF(i,RdV,fSATUN(16,fGETUHALF(i,RtV)-fGETUHALF(i,RsV))); + } +}) + + + + +/**********************************************/ +/* Vector Reduce Add */ +/**********************************************/ + +Q6INSN(A2_vraddub,"Rdd32=vraddub(Rss32,Rtt32)",ATTRIBS(), +"Sum: two vectors of unsigned bytes", +{ + fHIDE(int i;) + RddV = 0; + for (i=0;i<4;i++) { + fSETWORD(0,RddV,(fGETWORD(0,RddV) + (fGETUBYTE(i,RssV)+fGETUBYTE(i,RttV)))); + } + for (i=4;i<8;i++) { + fSETWORD(1,RddV,(fGETWORD(1,RddV) + (fGETUBYTE(i,RssV)+fGETUBYTE(i,RttV)))); + } +}) + +Q6INSN(A2_vraddub_acc,"Rxx32+=vraddub(Rss32,Rtt32)",ATTRIBS(), +"Sum: two vectors of unsigned bytes", +{ + fHIDE(int i;) + for (i = 0; i < 4; i++) { + fSETWORD(0,RxxV,(fGETWORD(0,RxxV) + (fGETUBYTE(i,RssV)+fGETUBYTE(i,RttV)))); + } + for (i = 4; i < 8; i++) { + fSETWORD(1,RxxV,(fGETWORD(1,RxxV) + (fGETUBYTE(i,RssV)+fGETUBYTE(i,RttV)))); + } + fACC();\ +}) + + + +Q6INSN(M2_vraddh,"Rd32=vraddh(Rss32,Rtt32)",ATTRIBS(A_ARCHV3), +"Sum: two vectors of halves", +{ + fHIDE(int i;) + RdV = 0; + for (i=0;i<4;i++) { + RdV += (fGETHALF(i,RssV)+fGETHALF(i,RttV)); + } +}) + +Q6INSN(M2_vradduh,"Rd32=vradduh(Rss32,Rtt32)",ATTRIBS(A_ARCHV3), +"Sum: two vectors of unsigned halves", +{ + fHIDE(int i;) + RdV = 0; + for (i=0;i<4;i++) { + RdV += (fGETUHALF(i,RssV)+fGETUHALF(i,RttV)); + } +}) + +/**********************************************/ +/* Vector Sub */ +/**********************************************/ + +Q6INSN(A2_vsubub,"Rdd32=vsubub(Rtt32,Rss32)",ATTRIBS(), +"Sub vector of bytes", +{ + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBYTE(i,RddV,(fGETUBYTE(i,RttV)-fGETUBYTE(i,RssV))); + } +}) +DEF_MAPPING(A2_vsubb_map,"Rdd32=vsubb(Rss32,Rtt32)","Rdd32=vsubub(Rss32,Rtt32)") + +Q6INSN(A2_vsububs,"Rdd32=vsubub(Rtt32,Rss32):sat",ATTRIBS(), +"Sub vector of bytes", +{ + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBYTE(i,RddV,fSATUN(8,fGETUBYTE(i,RttV)-fGETUBYTE(i,RssV))); + } +}) + +Q6INSN(A2_vsubh,"Rdd32=vsubh(Rtt32,Rss32)",ATTRIBS(), +"Sub vector of half integers", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,fGETHALF(i,RttV)-fGETHALF(i,RssV)); + } +}) + +Q6INSN(A2_vsubhs,"Rdd32=vsubh(Rtt32,Rss32):sat",ATTRIBS(), +"Sub vector of half integers with saturation", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,fSATN(16,fGETHALF(i,RttV)-fGETHALF(i,RssV))); + } +}) + +Q6INSN(A2_vsubuhs,"Rdd32=vsubuh(Rtt32,Rss32):sat",ATTRIBS(), +"Sub vector of unsigned half integers with saturation", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,fSATUN(16,fGETUHALF(i,RttV)-fGETUHALF(i,RssV))); + } +}) + +Q6INSN(A2_vsubw,"Rdd32=vsubw(Rtt32,Rss32)",ATTRIBS(), +"Sub vector of words", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,fGETWORD(i,RttV)-fGETWORD(i,RssV)); + } +}) + +Q6INSN(A2_vsubws,"Rdd32=vsubw(Rtt32,Rss32):sat",ATTRIBS(), +"Sub vector of words with saturation", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,fSATN(32,fGETWORD(i,RttV)-fGETWORD(i,RssV))); + } +}) + + + + +/**********************************************/ +/* Vector Abs */ +/**********************************************/ + +Q6INSN(A2_vabsh,"Rdd32=vabsh(Rss32)",ATTRIBS(), +"Negate vector of half integers", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,fABS(fGETHALF(i,RssV))); + } +}) + +Q6INSN(A2_vabshsat,"Rdd32=vabsh(Rss32):sat",ATTRIBS(), +"Negate vector of half integers", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,fSATH(fABS(fGETHALF(i,RssV)))); + } +}) + +Q6INSN(A2_vabsw,"Rdd32=vabsw(Rss32)",ATTRIBS(), +"Absolute Value vector of words", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,fABS(fGETWORD(i,RssV))); + } +}) + +Q6INSN(A2_vabswsat,"Rdd32=vabsw(Rss32):sat",ATTRIBS(), +"Absolute Value vector of words", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,fSAT(fABS(fGETWORD(i,RssV)))); + } +}) + +/**********************************************/ +/* Vector SAD */ +/**********************************************/ + + +Q6INSN(M2_vabsdiffw,"Rdd32=vabsdiffw(Rtt32,Rss32)",ATTRIBS(A_ARCHV2), +"Absolute Differences: vector of words", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,fABS(fGETWORD(i,RttV) - fGETWORD(i,RssV))); + } +}) + +Q6INSN(M2_vabsdiffh,"Rdd32=vabsdiffh(Rtt32,Rss32)",ATTRIBS(A_ARCHV2), +"Absolute Differences: vector of halfwords", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,fABS(fGETHALF(i,RttV) - fGETHALF(i,RssV))); + } +}) + +Q6INSN(M6_vabsdiffb,"Rdd32=vabsdiffb(Rtt32,Rss32)",ATTRIBS(), +"Absolute Differences: vector of halfwords", +{ + fHIDE(int i;) + for (i=0;i<8;i++) { + fSETBYTE(i,RddV,fABS(fGETBYTE(i,RttV) - fGETBYTE(i,RssV))); + } +}) + +Q6INSN(M6_vabsdiffub,"Rdd32=vabsdiffub(Rtt32,Rss32)",ATTRIBS(), +"Absolute Differences: vector of halfwords", +{ + fHIDE(int i;) + for (i=0;i<8;i++) { + fSETBYTE(i,RddV,fABS(fGETUBYTE(i,RttV) - fGETUBYTE(i,RssV))); + } +}) + + + +Q6INSN(A2_vrsadub,"Rdd32=vrsadub(Rss32,Rtt32)",ATTRIBS(), +"Sum of Absolute Differences: vector of unsigned bytes", +{ + fHIDE(int i;) + RddV = 0; + for (i = 0; i < 4; i++) { + fSETWORD(0,RddV,(fGETWORD(0,RddV) + fABS((fGETUBYTE(i,RssV) - fGETUBYTE(i,RttV))))); + } + for (i = 4; i < 8; i++) { + fSETWORD(1,RddV,(fGETWORD(1,RddV) + fABS((fGETUBYTE(i,RssV) - fGETUBYTE(i,RttV))))); + } +}) + +Q6INSN(A2_vrsadub_acc,"Rxx32+=vrsadub(Rss32,Rtt32)",ATTRIBS(), +"Sum of Absolute Differences: vector of unsigned bytes", +{ + fHIDE(int i;) + for (i = 0; i < 4; i++) { + fSETWORD(0,RxxV,(fGETWORD(0,RxxV) + fABS((fGETUBYTE(i,RssV) - fGETUBYTE(i,RttV))))); + } + for (i = 4; i < 8; i++) { + fSETWORD(1,RxxV,(fGETWORD(1,RxxV) + fABS((fGETUBYTE(i,RssV) - fGETUBYTE(i,RttV))))); + } + fACC();\ +}) + + +/**********************************************/ +/* Vector Average */ +/**********************************************/ + +Q6INSN(A2_vavgub,"Rdd32=vavgub(Rss32,Rtt32)",ATTRIBS(), +"Average vector of unsigned bytes", +{ + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBYTE(i,RddV,((fGETUBYTE(i,RssV) + fGETUBYTE(i,RttV))>>1)); + } +}) + +Q6INSN(A2_vavguh,"Rdd32=vavguh(Rss32,Rtt32)",ATTRIBS(), +"Average vector of unsigned halfwords", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,(fGETUHALF(i,RssV)+fGETUHALF(i,RttV))>>1); + } +}) + +Q6INSN(A2_vavgh,"Rdd32=vavgh(Rss32,Rtt32)",ATTRIBS(), +"Average vector of halfwords", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,(fGETHALF(i,RssV)+fGETHALF(i,RttV))>>1); + } +}) + +Q6INSN(A2_vnavgh,"Rdd32=vnavgh(Rtt32,Rss32)",ATTRIBS(), +"Negative Average vector of halfwords", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,(fGETHALF(i,RttV)-fGETHALF(i,RssV))>>1); + } +}) + +Q6INSN(A2_vavgw,"Rdd32=vavgw(Rss32,Rtt32)",ATTRIBS(), +"Average vector of words", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,(fSXTN(32,33,fGETWORD(i,RssV))+fSXTN(32,33,fGETWORD(i,RttV)))>>1); + } +}) + +Q6INSN(A2_vnavgw,"Rdd32=vnavgw(Rtt32,Rss32)",ATTRIBS(A_ARCHV2), +"Average vector of words", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,(fSXTN(32,33,fGETWORD(i,RttV))-fSXTN(32,33,fGETWORD(i,RssV)))>>1); + } +}) + +Q6INSN(A2_vavgwr,"Rdd32=vavgw(Rss32,Rtt32):rnd",ATTRIBS(), +"Average vector of words", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,(fSXTN(32,33,fGETWORD(i,RssV))+fSXTN(32,33,fGETWORD(i,RttV))+1)>>1); + } +}) + +Q6INSN(A2_vnavgwr,"Rdd32=vnavgw(Rtt32,Rss32):rnd:sat",ATTRIBS(A_ARCHV2), +"Average vector of words", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,fSAT((fSXTN(32,33,fGETWORD(i,RttV))-fSXTN(32,33,fGETWORD(i,RssV))+1)>>1)); + } +}) + +Q6INSN(A2_vavgwcr,"Rdd32=vavgw(Rss32,Rtt32):crnd",ATTRIBS(A_ARCHV2), +"Average vector of words with convergent rounding", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,(fCRND(fSXTN(32,33,fGETWORD(i,RssV))+fSXTN(32,33,fGETWORD(i,RttV)))>>1)); + } +}) + +Q6INSN(A2_vnavgwcr,"Rdd32=vnavgw(Rtt32,Rss32):crnd:sat",ATTRIBS(A_ARCHV2), +"Average negative vector of words with convergent rounding", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,fSAT(fCRND(fSXTN(32,33,fGETWORD(i,RttV))-fSXTN(32,33,fGETWORD(i,RssV)))>>1)); + } +}) + +Q6INSN(A2_vavghcr,"Rdd32=vavgh(Rss32,Rtt32):crnd",ATTRIBS(A_ARCHV2), +"Average vector of halfwords with conv rounding", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,fCRND(fGETHALF(i,RssV)+fGETHALF(i,RttV))>>1); + } +}) + +Q6INSN(A2_vnavghcr,"Rdd32=vnavgh(Rtt32,Rss32):crnd:sat",ATTRIBS(A_ARCHV2), +"Average negative vector of halfwords with conv rounding", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,fSATH(fCRND(fGETHALF(i,RttV)-fGETHALF(i,RssV))>>1)); + } +}) + + +Q6INSN(A2_vavguw,"Rdd32=vavguw(Rss32,Rtt32)",ATTRIBS(), +"Average vector of unsigned words", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,(fZXTN(32,33,fGETUWORD(i,RssV))+fZXTN(32,33,fGETUWORD(i,RttV)))>>1); + } +}) + +Q6INSN(A2_vavguwr,"Rdd32=vavguw(Rss32,Rtt32):rnd",ATTRIBS(), +"Average vector of unsigned words", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,(fZXTN(32,33,fGETUWORD(i,RssV))+fZXTN(32,33,fGETUWORD(i,RttV))+1)>>1); + } +}) + +Q6INSN(A2_vavgubr,"Rdd32=vavgub(Rss32,Rtt32):rnd",ATTRIBS(), +"Average vector of unsigned bytes", +{ + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBYTE(i,RddV,((fGETUBYTE(i,RssV)+fGETUBYTE(i,RttV)+1)>>1)); + } +}) + +Q6INSN(A2_vavguhr,"Rdd32=vavguh(Rss32,Rtt32):rnd",ATTRIBS(), +"Average vector of unsigned halfwords with rounding", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,(fGETUHALF(i,RssV)+fGETUHALF(i,RttV)+1)>>1); + } +}) + +Q6INSN(A2_vavghr,"Rdd32=vavgh(Rss32,Rtt32):rnd",ATTRIBS(), +"Average vector of halfwords with rounding", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,(fGETHALF(i,RssV)+fGETHALF(i,RttV)+1)>>1); + } +}) + +Q6INSN(A2_vnavghr,"Rdd32=vnavgh(Rtt32,Rss32):rnd:sat",ATTRIBS(A_ARCHV2), +"Negative Average vector of halfwords with rounding", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV,fSATH((fGETHALF(i,RttV)-fGETHALF(i,RssV)+1)>>1)); + } +}) + + +/* Rounding Instruction */ + +Q6INSN(A4_round_ri,"Rd32=round(Rs32,#u5)",ATTRIBS(),"Round", {RdV = fRNDN(RsV,uiV)>>uiV; }) +Q6INSN(A4_round_rr,"Rd32=round(Rs32,Rt32)",ATTRIBS(),"Round", {RdV = fRNDN(RsV,fZXTN(5,32,RtV))>>fZXTN(5,32,RtV); }) +Q6INSN(A4_round_ri_sat,"Rd32=round(Rs32,#u5):sat",ATTRIBS(),"Round", {RdV = (fSAT(fRNDN(RsV,uiV)))>>uiV; }) +Q6INSN(A4_round_rr_sat,"Rd32=round(Rs32,Rt32):sat",ATTRIBS(),"Round", {RdV = (fSAT(fRNDN(RsV,fZXTN(5,32,RtV))))>>fZXTN(5,32,RtV); }) + + +Q6INSN(A4_cround_ri,"Rd32=cround(Rs32,#u5)",ATTRIBS(),"Convergent Round", {RdV = fCRNDN(RsV,uiV); }) +Q6INSN(A4_cround_rr,"Rd32=cround(Rs32,Rt32)",ATTRIBS(),"Convergent Round", {RdV = fCRNDN(RsV,fZXTN(5,32,RtV)); }) + + +#define CROUND(DST,SRC,SHIFT) \ + fHIDE(size16s_t rndbit_128;)\ + fHIDE(size16s_t tmp128;)\ + fHIDE(size16s_t src_128;)\ + if (SHIFT == 0) { \ + DST = SRC;\ + } else if ((SRC & (size8s_t)((1LL << (SHIFT - 1)) - 1LL)) == 0) { \ + src_128 = fCAST8S_16S(SRC);\ + rndbit_128 = fCAST8S_16S(1LL);\ + rndbit_128 = fSHIFTL128(rndbit_128, SHIFT);\ + rndbit_128 = fAND128(rndbit_128, src_128);\ + rndbit_128 = fSHIFTR128(rndbit_128, 1);\ + tmp128 = fADD128(src_128, rndbit_128);\ + tmp128 = fSHIFTR128(tmp128, SHIFT);\ + DST = fCAST16S_8S(tmp128);\ + } else {\ + size16s_t rndbit_128 = fCAST8S_16S((1LL << (SHIFT - 1))); \ + size16s_t src_128 = fCAST8S_16S(SRC); \ + size16s_t tmp128 = fADD128(src_128, rndbit_128);\ + tmp128 = fSHIFTR128(tmp128, SHIFT);\ + DST = fCAST16S_8S(tmp128);\ + } + +Q6INSN(A7_croundd_ri,"Rdd32=cround(Rss32,#u6)",ATTRIBS(),"Convergent Round", +{ +CROUND(RddV,RssV,uiV); +fEXTENSION_AUDIO();}) + +Q6INSN(A7_croundd_rr,"Rdd32=cround(Rss32,Rt32)",ATTRIBS(),"Convergent Round", +{ +CROUND(RddV,RssV,fZXTN(6,32,RtV)); +fEXTENSION_AUDIO();}) + + + + + + + + + +Q6INSN(A7_clip,"Rd32=clip(Rs32,#u5)",ATTRIBS(),"Clip to #s5", { fCLIP(RdV,RsV,uiV); fEXTENSION_AUDIO();}) +Q6INSN(A7_vclip,"Rdd32=vclip(Rss32,#u5)",ATTRIBS(),"Clip to #s5", +{ +fHIDE(size4s_t tmp;) +fCLIP(tmp, fGETWORD(0, RssV), uiV); +fSETWORD(0, RddV, tmp); +fCLIP(tmp,fGETWORD(1, RssV), uiV); +fSETWORD(1, RddV, tmp); +fEXTENSION_AUDIO(); +} +) + + + +/**********************************************/ +/* V4: Cross Vector Min/Max */ +/**********************************************/ + + +#define VRMINORMAX(TAG,STR,OP,SHORTTYPE,SETTYPE,GETTYPE,NEL,SHIFT) \ +Q6INSN(A4_vr##TAG##SHORTTYPE,"Rxx32=vr"#TAG#SHORTTYPE"(Rss32,Ru32)",ATTRIBS(), \ +"Choose " STR " elements of a vector", \ +{ \ + fHIDE(int i; size8s_t TAG; size4s_t addr;) \ + TAG = fGET##GETTYPE(0,RxxV); \ + addr = fGETWORD(1,RxxV); \ + for (i = 0; i < NEL; i++) { \ + if (TAG OP fGET##GETTYPE(i,RssV)) { \ + TAG = fGET##GETTYPE(i,RssV); \ + addr = RuV | i<,SHORTTYPE,SETTYPE,GETTYPE,NEL,SHIFT) \ +VRMINORMAX(max,"maximum",<,SHORTTYPE,SETTYPE,GETTYPE,NEL,SHIFT) + + +RMINMAX(h,HALF,HALF,4,1) +RMINMAX(uh,HALF,UHALF,4,1) +RMINMAX(w,WORD,WORD,2,2) +RMINMAX(uw,WORD,UWORD,2,2) + +#undef RMINMAX +#undef VRMINORMAX + +/**********************************************/ +/* Vector Min/Max */ +/**********************************************/ + +#define VMINORMAX(TAG,STR,FUNC,SHORTTYPE,SETTYPE,GETTYPE,NEL) \ +Q6INSN(A2_v##TAG##SHORTTYPE,"Rdd32=v"#TAG#SHORTTYPE"(Rtt32,Rss32)",ATTRIBS(), \ +"Choose " STR " elements of two vectors", \ +{ \ + fHIDE(int i;) \ + for (i = 0; i < NEL; i++) { \ + fSET##SETTYPE(i,RddV,FUNC(fGET##GETTYPE(i,RttV),fGET##GETTYPE(i,RssV))); \ + } \ +}) + +#define VMINORMAX3(TAG,STR,FUNC,SHORTTYPE,SETTYPE,GETTYPE,NEL) \ +Q6INSN(A6_v##TAG##SHORTTYPE##3,"Rxx32=v"#TAG#SHORTTYPE"3(Rtt32,Rss32)",ATTRIBS(), \ +"Choose " STR " elements of two vectors", \ +{ \ + fHIDE(int i;) \ + for (i = 0; i < NEL; i++) { \ + fSET##SETTYPE(i,RxxV,FUNC(fGET##GETTYPE(i,RxxV),FUNC(fGET##GETTYPE(i,RttV),fGET##GETTYPE(i,RssV)))); \ + } \ +}) + +#define MINMAX(SHORTTYPE,SETTYPE,GETTYPE,NEL) \ +VMINORMAX(min,"minimum",fMIN,SHORTTYPE,SETTYPE,GETTYPE,NEL) \ +VMINORMAX(max,"maximum",fMAX,SHORTTYPE,SETTYPE,GETTYPE,NEL) + +MINMAX(b,BYTE,BYTE,8) +MINMAX(ub,BYTE,UBYTE,8) +MINMAX(h,HALF,HALF,4) +MINMAX(uh,HALF,UHALF,4) +MINMAX(w,WORD,WORD,2) +MINMAX(uw,WORD,UWORD,2) + +#undef MINMAX +#undef VMINORMAX +#undef VMINORMAX3 + + +Q6INSN(A5_ACS,"Rxx32,Pe4=vacsh(Rss32,Rtt32)",ATTRIBS(A_NOTE_LATEPRED,A_RESTRICT_LATEPRED), +"Add Compare and Select elements of two vectors, record the maximums and the decisions ", +{ + fHIDE(int i;) + fHIDE(int xv;) + fHIDE(int sv;) + fHIDE(int tv;) + for (i = 0; i < 4; i++) { + xv = (int) fGETHALF(i,RxxV); + sv = (int) fGETHALF(i,RssV); + tv = (int) fGETHALF(i,RttV); + xv = xv + tv; //assumes 17bit datapath + sv = sv - tv; //assumes 17bit datapath + fSETBIT(i*2, PeV, (xv > sv)); + fSETBIT(i*2+1,PeV, (xv > sv)); + fSETHALF(i, RxxV, fSATH(fMAX(xv,sv))); + } +}) + +Q6INSN(A6_vminub_RdP,"Rdd32,Pe4=vminub(Rtt32,Rss32)",ATTRIBS(A_NOTE_LATEPRED,A_RESTRICT_LATEPRED), +"Vector minimum of bytes, records minimum and decision vector", +{ + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBIT(i, PeV, (fGETUBYTE(i,RttV) > fGETUBYTE(i,RssV))); + fSETBYTE(i,RddV,fMIN(fGETUBYTE(i,RttV),fGETUBYTE(i,RssV))); + } +}) + +/**********************************************/ +/* Vector Min/Max */ +/**********************************************/ + + +Q6INSN(A4_modwrapu,"Rd32=modwrap(Rs32,Rt32)",ATTRIBS(), +"Wrap to an unsigned modulo buffer", +{ + if (RsV < 0) { + RdV = RsV + fCAST4u(RtV); + } else if (fCAST4u(RsV) >= fCAST4u(RtV)) { + RdV = RsV - fCAST4u(RtV); + } else { + RdV = RsV; + } +}) + diff --git a/target/hexagon/imported/branch.idef b/target/hexagon/imported/branch.idef new file mode 100644 index 0000000..142c1d5 --- /dev/null +++ b/target/hexagon/imported/branch.idef @@ -0,0 +1,344 @@ +/* + * Copyright (c) 2019 Qualcomm Innovation Center, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + + +/*********************************************/ +/* Jump instructions */ +/*********************************************/ + +#define A_COFSTD A_RESTRICT_LOOP_LA,A_RESTRICT_COF_MAX1 +#define A_JDIR A_JUMP,A_DIRECT,A_COFSTD,A_BRANCHADDER +#define A_CJNEWDIR A_JUMP,A_DIRECT,A_CJUMP,A_COFSTD,A_BRANCHADDER,A_DOTNEW,A_ROPS_2 +#define A_CJOLDDIR A_JUMP,A_DIRECT,A_CJUMP,A_COFSTD,A_BRANCHADDER,A_DOTOLD +#define A_NEWVALUEJ A_JUMP,A_DIRECT,A_CJUMP,A_COFSTD,A_BRANCHADDER,A_RESTRICT_COF_MAX1,A_DOTNEWVALUE,A_MEMLIKE_PACKET_RULES,A_RESTRICT_SINGLE_MEM_FIRST,A_ROPS_2 +#define A_JINDIR A_JUMP,A_INDIRECT,A_COFSTD +#define A_JINDIRNEW A_JUMP,A_INDIRECT,A_COFSTD,A_CJUMP,A_DOTNEW +#define A_JINDIROLD A_JUMP,A_INDIRECT,A_COFSTD,A_CJUMP,A_DOTOLD + +#define A_RELAX_COND A_RESTRICT_COF_MAX1,A_RELAX_COF_2ND,A_RELAX_COF_1ST +#define A_RELAX_UNC A_RESTRICT_COF_MAX1,A_RELAX_COF_2ND + +Q6INSN(J2_jump,"jump #r22:2",ATTRIBS(A_JDIR,A_RELAX_UNC), "direct unconditional jump", +{fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}) + +Q6INSN(J2_jumpr,"jumpr Rs32",ATTRIBS(A_JINDIR), "indirect unconditional jump", +{fJUMPR(RsN,RsV,COF_TYPE_JUMPR);}) + +#define OLDCOND_JUMP(TAG,OPER,OPER2,ATTRIB,DESCR,SEMANTICS) \ +Q6INSN(TAG##t,"if (Pu4) "OPER":nt "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBOLD(PuV),,SPECULATE_NOT_TAKEN,12,0); if (fLSBOLD(PuV)) { SEMANTICS; }}) \ +Q6INSN(TAG##f,"if (!Pu4) "OPER":nt "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBOLDNOT(PuV),,SPECULATE_NOT_TAKEN,12,0); if (fLSBOLDNOT(PuV)) { SEMANTICS; }}) \ +Q6INSN(TAG##tpt,"if (Pu4) "OPER":t "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBOLD(PuV),,SPECULATE_TAKEN,12,0); if (fLSBOLD(PuV)) { SEMANTICS; }}) \ +Q6INSN(TAG##fpt,"if (!Pu4) "OPER":t "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBOLDNOT(PuV),,SPECULATE_TAKEN,12,0); if (fLSBOLDNOT(PuV)) { SEMANTICS; }}) \ +DEF_MAPPING(TAG##t_nopred_map,"if (Pu4) "OPER" "OPER2,"if (Pu4) "OPER":nt "OPER2) \ +DEF_MAPPING(TAG##f_nopred_map,"if (!Pu4) "OPER" "OPER2,"if (!Pu4) "OPER":nt "OPER2) + +OLDCOND_JUMP(J2_jump,"jump","#r15:2",ATTRIBS(A_CJOLDDIR,A_NOTE_CONDITIONAL,A_RELAX_COND,A_PRED_BIT_12),"direct conditional jump", +fIMMEXT(riV);fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);) + +OLDCOND_JUMP(J2_jumpr,"jumpr","Rs32",ATTRIBS(A_JINDIROLD,A_NOTE_CONDITIONAL,A_PRED_BIT_12),"indirect conditional jump", +fJUMPR(RsN,RsV,COF_TYPE_JUMPR);) + +#define NEWCOND_JUMP(TAG,OPER,OPER2,ATTRIB,DESCR,SEMANTICS)\ +Q6INSN(TAG##tnew,"if (Pu4.new) "OPER":nt "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBNEW(PuN),, SPECULATE_NOT_TAKEN , 12,0)} {if(fLSBNEW(PuN)){SEMANTICS;}})\ +Q6INSN(TAG##fnew,"if (!Pu4.new) "OPER":nt "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBNEWNOT(PuN),, SPECULATE_NOT_TAKEN , 12,0)} {if(fLSBNEWNOT(PuN)){SEMANTICS;}})\ +Q6INSN(TAG##tnewpt,"if (Pu4.new) "OPER":t "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBNEW(PuN),, SPECULATE_TAKEN , 12,0)} {if(fLSBNEW(PuN)){SEMANTICS;}})\ +Q6INSN(TAG##fnewpt,"if (!Pu4.new) "OPER":t "OPER2,ATTRIB,DESCR,{fBRANCH_SPECULATE_STALL(fLSBNEWNOT(PuN),, SPECULATE_TAKEN , 12,0)} {if(fLSBNEWNOT(PuN)){SEMANTICS;}}) + +NEWCOND_JUMP(J2_jump,"jump","#r15:2",ATTRIBS(A_CJNEWDIR,A_NOTE_CONDITIONAL,A_ARCHV2,A_RELAX_COND,A_PRED_BIT_12),"direct conditional jump", +fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMPNEW);) + +NEWCOND_JUMP(J2_jumpr,"jumpr","Rs32",ATTRIBS(A_JINDIRNEW,A_NOTE_CONDITIONAL,A_ARCHV3,A_PRED_BIT_12),"indirect conditional jump", +fJUMPR(RsN,RsV,COF_TYPE_JUMPR);) + + + +Q6INSN(J4_hintjumpr,"hintjr(Rs32)",ATTRIBS(A_JINDIR,A_HINTJR),"hint indirect conditional jump", +{fHINTJR(RsV);}) + + +/*********************************************/ +/* Compound Compare-Jumps */ +/*********************************************/ +Q6INSN(J2_jumprz,"if (Rs32!=#0) jump:nt #r13:2",ATTRIBS(A_NOTE_DEPRECATED,A_CJNEWDIR,A_ARCHV3,A_RELAX_COND,A_PRED_BIT_12),"direct conditional jump if register true", +{fBRANCH_SPECULATE_STALL((RsV!=0), , SPECULATE_NOT_TAKEN,12,0) if (RsV != 0) { fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}}) + +Q6INSN(J2_jumprnz,"if (Rs32==#0) jump:nt #r13:2",ATTRIBS(A_NOTE_DEPRECATED,A_CJNEWDIR,A_ARCHV3,A_RELAX_COND,A_PRED_BIT_12),"direct conditional jump if register false", +{fBRANCH_SPECULATE_STALL((RsV==0), , SPECULATE_NOT_TAKEN,12,0) if (RsV == 0) {fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}}) + +Q6INSN(J2_jumprzpt,"if (Rs32!=#0) jump:t #r13:2",ATTRIBS(A_NOTE_DEPRECATED,A_CJNEWDIR,A_ARCHV3,A_RELAX_COND,A_PRED_BIT_12),"direct conditional jump if register true", +{fBRANCH_SPECULATE_STALL((RsV!=0), , SPECULATE_TAKEN,12,0) if (RsV != 0) { fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}}) + +Q6INSN(J2_jumprnzpt,"if (Rs32==#0) jump:t #r13:2",ATTRIBS(A_NOTE_DEPRECATED,A_CJNEWDIR,A_ARCHV3,A_RELAX_COND,A_PRED_BIT_12),"direct conditional jump if register false", +{fBRANCH_SPECULATE_STALL((RsV==0), , SPECULATE_TAKEN,12,0) if (RsV == 0) {fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}}) + +Q6INSN(J2_jumprgtez,"if (Rs32>=#0) jump:nt #r13:2",ATTRIBS(A_NOTE_DEPRECATED,A_CJNEWDIR,A_ARCHV3,A_RELAX_COND,A_PRED_BIT_12),"direct conditional jump if register greater or equal to zero", +{fBRANCH_SPECULATE_STALL((RsV>=0), , SPECULATE_NOT_TAKEN,12,0) if (RsV>=0) { fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}}) + +Q6INSN(J2_jumprgtezpt,"if (Rs32>=#0) jump:t #r13:2",ATTRIBS(A_NOTE_DEPRECATED,A_CJNEWDIR,A_ARCHV3,A_RELAX_COND,A_PRED_BIT_12),"direct conditional jump if register greater or equal to zero", +{fBRANCH_SPECULATE_STALL((RsV>=0), , SPECULATE_TAKEN,12,0) if (RsV>=0) { fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}}) + +Q6INSN(J2_jumprltez,"if (Rs32<=#0) jump:nt #r13:2",ATTRIBS(A_NOTE_DEPRECATED,A_CJNEWDIR,A_ARCHV3,A_RELAX_COND,A_PRED_BIT_12),"direct conditional jump if register less than or equal to zero", +{fBRANCH_SPECULATE_STALL((RsV<=0), , SPECULATE_NOT_TAKEN,12,0) if (RsV<=0) { fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}}) + +Q6INSN(J2_jumprltezpt,"if (Rs32<=#0) jump:t #r13:2",ATTRIBS(A_NOTE_DEPRECATED,A_CJNEWDIR,A_ARCHV3,A_RELAX_COND,A_PRED_BIT_12),"direct conditional jump if register less than or equal to zero", +{fBRANCH_SPECULATE_STALL((RsV<=0), , SPECULATE_TAKEN,12,0) if (RsV<=0) { fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}}) + + + +/*********************************************/ +/* V4 Compound Compare-Jumps */ +/*********************************************/ + + +/* V4 compound compare jumps (CJ) */ +#define STD_CMPJUMP(TAG,TST,TSTSEM)\ +Q6INSN(J4_##TAG##_tp0_jump_nt, "p0="TST"; if (p0.new) jump:nt #r9:2", ATTRIBS(A_CJNEWDIR,A_RELAX_COND,A_NEWCMPJUMP,A_PRED_BIT_13),"compound compare-jump", {fPART1(fWRITE_P0(f8BITSOF(TSTSEM))) fBRANCH_SPECULATE_STALL(fLSBNEW0,,SPECULATE_NOT_TAKEN,13,0) if (fLSBNEW0) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\ +Q6INSN(J4_##TAG##_fp0_jump_nt, "p0="TST"; if (!p0.new) jump:nt #r9:2", ATTRIBS(A_CJNEWDIR,A_RELAX_COND,A_NEWCMPJUMP,A_PRED_BIT_13),"compound compare-jump",{fPART1(fWRITE_P0(f8BITSOF(TSTSEM))) fBRANCH_SPECULATE_STALL(fLSBNEW0NOT,,SPECULATE_NOT_TAKEN,13,0) if (fLSBNEW0NOT) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\ +Q6INSN(J4_##TAG##_tp0_jump_t, "p0="TST"; if (p0.new) jump:t #r9:2", ATTRIBS(A_CJNEWDIR,A_RELAX_COND,A_NEWCMPJUMP,A_PRED_BIT_13),"compound compare-jump", {fPART1(fWRITE_P0(f8BITSOF(TSTSEM))) fBRANCH_SPECULATE_STALL(fLSBNEW0,,SPECULATE_TAKEN,13,0) if (fLSBNEW0) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\ +Q6INSN(J4_##TAG##_fp0_jump_t, "p0="TST"; if (!p0.new) jump:t #r9:2", ATTRIBS(A_CJNEWDIR,A_RELAX_COND,A_NEWCMPJUMP,A_PRED_BIT_13),"compound compare-jump", {fPART1(fWRITE_P0(f8BITSOF(TSTSEM))) fBRANCH_SPECULATE_STALL(fLSBNEW0NOT,,SPECULATE_TAKEN,13,0) if (fLSBNEW0NOT) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\ +Q6INSN(J4_##TAG##_tp1_jump_nt, "p1="TST"; if (p1.new) jump:nt #r9:2", ATTRIBS(A_CJNEWDIR,A_RELAX_COND,A_NEWCMPJUMP,A_PRED_BIT_13),"compound compare-jump", {fPART1(fWRITE_P1(f8BITSOF(TSTSEM))) fBRANCH_SPECULATE_STALL(fLSBNEW1,,SPECULATE_NOT_TAKEN,13,0) if (fLSBNEW1) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\ +Q6INSN(J4_##TAG##_fp1_jump_nt, "p1="TST"; if (!p1.new) jump:nt #r9:2", ATTRIBS(A_CJNEWDIR,A_RELAX_COND,A_NEWCMPJUMP,A_PRED_BIT_13),"compound compare-jump",{fPART1(fWRITE_P1(f8BITSOF(TSTSEM))) fBRANCH_SPECULATE_STALL(fLSBNEW1NOT,,SPECULATE_NOT_TAKEN,13,0) if (fLSBNEW1NOT) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\ +Q6INSN(J4_##TAG##_tp1_jump_t, "p1="TST"; if (p1.new) jump:t #r9:2", ATTRIBS(A_CJNEWDIR,A_RELAX_COND,A_NEWCMPJUMP,A_PRED_BIT_13),"compound compare-jump", {fPART1(fWRITE_P1(f8BITSOF(TSTSEM))) fBRANCH_SPECULATE_STALL(fLSBNEW1,,SPECULATE_TAKEN,13,0) if (fLSBNEW1) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\ +Q6INSN(J4_##TAG##_fp1_jump_t, "p1="TST"; if (!p1.new) jump:t #r9:2", ATTRIBS(A_CJNEWDIR,A_RELAX_COND,A_NEWCMPJUMP,A_PRED_BIT_13),"compound compare-jump", {fPART1(fWRITE_P1(f8BITSOF(TSTSEM))) fBRANCH_SPECULATE_STALL(fLSBNEW1NOT,,SPECULATE_TAKEN,13,0) if (fLSBNEW1NOT) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}}) + + +STD_CMPJUMP(cmpeqi,"cmp.eq(Rs16,#U5)",(RsV==UiV)) +STD_CMPJUMP(cmpgti,"cmp.gt(Rs16,#U5)",(RsV>UiV)) +STD_CMPJUMP(cmpgtui,"cmp.gtu(Rs16,#U5)",(fCAST4u(RsV)>UiV)) + +STD_CMPJUMP(cmpeqn1,"cmp.eq(Rs16,#-1)",(RsV==-1)) +STD_CMPJUMP(cmpgtn1,"cmp.gt(Rs16,#-1)",(RsV>-1)) +STD_CMPJUMP(tstbit0,"tstbit(Rs16,#0)",(RsV & 1)) + +STD_CMPJUMP(cmpeq,"cmp.eq(Rs16,Rt16)",(RsV==RtV)) +STD_CMPJUMP(cmpgt,"cmp.gt(Rs16,Rt16)",(RsV>RtV)) +STD_CMPJUMP(cmpgtu,"cmp.gtu(Rs16,Rt16)",(fCAST4u(RsV)>RtV)) + + + +/* V4 jump and transfer (CJ) */ +Q6INSN(J4_jumpseti,"Rd16=#U6 ; jump #r9:2",ATTRIBS(A_JDIR,A_RELAX_UNC), "direct unconditional jump and set register to immediate", +{fIMMEXT(riV); fPCALIGN(riV); RdV=UiV; fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}) + +Q6INSN(J4_jumpsetr,"Rd16=Rs16 ; jump #r9:2",ATTRIBS(A_JDIR,A_RELAX_UNC), "direct unconditional jump and transfer register", +{fIMMEXT(riV); fPCALIGN(riV); RdV=RsV; fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}) + + +/* V4 new-value jumps (NCJ) */ +#define STD_CMPJUMPNEWRS(TAG,TST,TSTSEM)\ +Q6INSN(J4_##TAG##_jumpnv_t, "if ("TST") jump:t #r9:2", ATTRIBS(A_NEWVALUEJ,A_RESTRICT_NOSLOT1_STORE,A_RESTRICT_SLOT0ONLY,A_PRED_BIT_13),"compound compare-jump",{fBRANCH_SPECULATE_STALL(TSTSEM,,SPECULATE_TAKEN,13,0);if (TSTSEM) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}})\ +Q6INSN(J4_##TAG##_jumpnv_nt,"if ("TST") jump:nt #r9:2",ATTRIBS(A_NEWVALUEJ,A_RESTRICT_NOSLOT1_STORE,A_RESTRICT_SLOT0ONLY,A_PRED_BIT_13),"compound compare-jump",{fBRANCH_SPECULATE_STALL(TSTSEM,,SPECULATE_NOT_TAKEN,13,0); if (TSTSEM) {fIMMEXT(riV); fPCALIGN(riV); fBRANCH(fREAD_PC()+riV,COF_TYPE_JUMP);}}) + + + + +STD_CMPJUMPNEWRS(cmpeqi_t,"cmp.eq(Ns8.new,#U5)",(fNEWREG(NsN)==(UiV))) +STD_CMPJUMPNEWRS(cmpeqi_f,"!cmp.eq(Ns8.new,#U5)",(fNEWREG(NsN)!=(UiV))) +STD_CMPJUMPNEWRS(cmpgti_t,"cmp.gt(Ns8.new,#U5)",(fNEWREG(NsN)>(UiV))) +STD_CMPJUMPNEWRS(cmpgti_f,"!cmp.gt(Ns8.new,#U5)",!(fNEWREG(NsN)>(UiV))) +STD_CMPJUMPNEWRS(cmpgtui_t,"cmp.gtu(Ns8.new,#U5)",(fCAST4u(fNEWREG(NsN))>(UiV))) +STD_CMPJUMPNEWRS(cmpgtui_f,"!cmp.gtu(Ns8.new,#U5)",!(fCAST4u(fNEWREG(NsN))>(UiV))) + + +STD_CMPJUMPNEWRS(cmpeqn1_t,"cmp.eq(Ns8.new,#-1)",(fNEWREG(NsN)==(-1))) +STD_CMPJUMPNEWRS(cmpeqn1_f,"!cmp.eq(Ns8.new,#-1)",(fNEWREG(NsN)!=(-1))) +STD_CMPJUMPNEWRS(cmpgtn1_t,"cmp.gt(Ns8.new,#-1)",(fNEWREG(NsN)>(-1))) +STD_CMPJUMPNEWRS(cmpgtn1_f,"!cmp.gt(Ns8.new,#-1)",!(fNEWREG(NsN)>(-1))) +STD_CMPJUMPNEWRS(tstbit0_t,"tstbit(Ns8.new,#0)",((fNEWREG(NsN)) & 1)) +STD_CMPJUMPNEWRS(tstbit0_f,"!tstbit(Ns8.new,#0)",!((fNEWREG(NsN)) & 1)) + + +STD_CMPJUMPNEWRS(cmpeq_t, "cmp.eq(Ns8.new,Rt32)", (fNEWREG(NsN)==RtV)) +STD_CMPJUMPNEWRS(cmpgt_t, "cmp.gt(Ns8.new,Rt32)", (fNEWREG(NsN)>RtV)) +STD_CMPJUMPNEWRS(cmpgtu_t,"cmp.gtu(Ns8.new,Rt32)",(fCAST4u(fNEWREG(NsN))>fCAST4u(RtV))) +STD_CMPJUMPNEWRS(cmplt_t, "cmp.gt(Rt32,Ns8.new)", (RtV>fNEWREG(NsN))) +STD_CMPJUMPNEWRS(cmpltu_t,"cmp.gtu(Rt32,Ns8.new)",(fCAST4u(RtV)>fCAST4u(fNEWREG(NsN)))) +STD_CMPJUMPNEWRS(cmpeq_f, "!cmp.eq(Ns8.new,Rt32)", (fNEWREG(NsN)!=RtV)) +STD_CMPJUMPNEWRS(cmpgt_f, "!cmp.gt(Ns8.new,Rt32)", !(fNEWREG(NsN)>RtV)) +STD_CMPJUMPNEWRS(cmpgtu_f,"!cmp.gtu(Ns8.new,Rt32)",!(fCAST4u(fNEWREG(NsN))>fCAST4u(RtV))) +STD_CMPJUMPNEWRS(cmplt_f, "!cmp.gt(Rt32,Ns8.new)", !(RtV>fNEWREG(NsN))) +STD_CMPJUMPNEWRS(cmpltu_f,"!cmp.gtu(Rt32,Ns8.new)",!(fCAST4u(RtV)>fCAST4u(fNEWREG(NsN)))) + + + + + +/*********************************************/ +/* Subroutine Call instructions */ +/*********************************************/ + +#define CALL_NOTES A_NOTE_PACKET_PC,A_NOTE_PACKET_NPC,A_NOTE_RELATIVE_ADDRESS +#define CDIR_STD A_CALL,A_DIRECT,CALL_NOTES,A_COFSTD,A_BRANCHADDER +#define CINDIR_STD A_CALL,A_INDIRECT,A_COFSTD + +Q6INSN(J2_call,"call #r22:2",ATTRIBS(CDIR_STD,A_RELAX_UNC), "direct unconditional call", +{fIMMEXT(riV); fPCALIGN(riV); fCALL(fREAD_PC()+riV); }) + +Q6INSN(J2_callt,"if (Pu4) call #r15:2",ATTRIBS(CDIR_STD,A_NOTE_CONDITIONAL,A_RELAX_COND,A_PRED_BIT_12),"direct conditional call if true", +{fIMMEXT(riV); fPCALIGN(riV); fBRANCH_SPECULATE_STALL(fLSBOLD(PuV),,SPECULATE_NOT_TAKEN,12,0); if (fLSBOLD(PuV)) { fCALL(fREAD_PC()+riV); }}) + +Q6INSN(J2_callf,"if (!Pu4) call #r15:2",ATTRIBS(CDIR_STD,A_NOTE_CONDITIONAL,A_RELAX_COND,A_PRED_BIT_12),"direct conditional call if false", +{fIMMEXT(riV); fPCALIGN(riV); fBRANCH_SPECULATE_STALL(fLSBOLDNOT(PuV),,SPECULATE_NOT_TAKEN,12,0);if (fLSBOLDNOT(PuV)) { fCALL(fREAD_PC()+riV); }}) + +Q6INSN(J2_callr,"callr Rs32",ATTRIBS(CINDIR_STD), "indirect unconditional call", +{ fCALLR(RsV); }) + +Q6INSN(J2_callrt,"if (Pu4) callr Rs32",ATTRIBS(CINDIR_STD,A_NOTE_CONDITIONAL,A_PRED_BIT_12),"indirect conditional call if true", +{fBRANCH_SPECULATE_STALL(fLSBOLD(PuV),,SPECULATE_NOT_TAKEN,12,0);if (fLSBOLD(PuV)) { fCALLR(RsV); }}) + +Q6INSN(J2_callrf,"if (!Pu4) callr Rs32",ATTRIBS(CINDIR_STD,A_NOTE_CONDITIONAL,A_PRED_BIT_12),"indirect conditional call if false", +{fBRANCH_SPECULATE_STALL(fLSBOLDNOT(PuV),,SPECULATE_NOT_TAKEN,12,0);if (fLSBOLDNOT(PuV)) { fCALLR(RsV); }}) + + + + +#define ATTRIB_HWLOOP_SETUP A_NOTE_PACKET_NPC,A_NOTE_RELATIVE_ADDRESS,A_NOTE_PACKET_PC,A_NOTE_LA_RESTRICT,A_RESTRICT_LOOP_LA,A_BRANCHADDER,A_IT_HWLOOP,A_RELAX_COF_1ST,A_RELAX_COF_2ND +#define ATTRIB_HWLOOP A_NOTE_PACKET_NPC,A_NOTE_PACKET_PC,A_RESTRICT_NOCOF,A_NOTE_NOCOF_RESTRICT,A_ROPS_3 + +/*********************************************/ +/* HW Loop instructions */ +/*********************************************/ + +Q6INSN(J2_loop0r,"loop0(#r7:2,Rs32)",ATTRIBS(ATTRIB_HWLOOP_SETUP,A_HWLOOP0_SETUP),"Initialize HW loop 0", +{ fIMMEXT(riV); fPCALIGN(riV); + fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, RsV); + fSET_LPCFG(0); +}) + +Q6INSN(J2_loop1r,"loop1(#r7:2,Rs32)",ATTRIBS(ATTRIB_HWLOOP_SETUP,A_HWLOOP1_SETUP),"Initialize HW loop 1", +{ fIMMEXT(riV); fPCALIGN(riV); + fWRITE_LOOP_REGS1(/*sa,lc*/ fREAD_PC()+riV, RsV); +}) + +Q6INSN(J2_loop0i,"loop0(#r7:2,#U10)",ATTRIBS(ATTRIB_HWLOOP_SETUP,A_HWLOOP0_SETUP),"Initialize HW loop 0", +{ fIMMEXT(riV); fPCALIGN(riV); + fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, UiV); + fSET_LPCFG(0); +}) + +Q6INSN(J2_loop1i,"loop1(#r7:2,#U10)",ATTRIBS(ATTRIB_HWLOOP_SETUP,A_HWLOOP1_SETUP),"Initialize HW loop 1", +{ fIMMEXT(riV); fPCALIGN(riV); + fWRITE_LOOP_REGS1(/*sa,lc*/ fREAD_PC()+riV, UiV); +}) + + +Q6INSN(J2_ploop1sr,"p3=sp1loop0(#r7:2,Rs32)",ATTRIBS(ATTRIB_HWLOOP_SETUP,A_ARCHV2,A_RESTRICT_LATEPRED,A_NOTE_LATEPRED,A_HWLOOP0_SETUP),"Initialize HW loop 0", +{ fIMMEXT(riV); fPCALIGN(riV); + fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, RsV); + fSET_LPCFG(1); + fWRITE_P3_LATE(0); +}) +Q6INSN(J2_ploop1si,"p3=sp1loop0(#r7:2,#U10)",ATTRIBS(ATTRIB_HWLOOP_SETUP,A_ARCHV2,A_RESTRICT_LATEPRED,A_NOTE_LATEPRED,A_HWLOOP0_SETUP),"Initialize HW loop 0", +{ fIMMEXT(riV); fPCALIGN(riV); + fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, UiV); + fSET_LPCFG(1); + fWRITE_P3_LATE(0); +}) + +Q6INSN(J2_ploop2sr,"p3=sp2loop0(#r7:2,Rs32)",ATTRIBS(ATTRIB_HWLOOP_SETUP,A_ARCHV2,A_RESTRICT_LATEPRED,A_NOTE_LATEPRED,A_HWLOOP0_SETUP),"Initialize HW loop 0", +{ fIMMEXT(riV); fPCALIGN(riV); + fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, RsV); + fSET_LPCFG(2); + fWRITE_P3_LATE(0); +}) +Q6INSN(J2_ploop2si,"p3=sp2loop0(#r7:2,#U10)",ATTRIBS(ATTRIB_HWLOOP_SETUP,A_ARCHV2,A_RESTRICT_LATEPRED,A_NOTE_LATEPRED,A_HWLOOP0_SETUP),"Initialize HW loop 0", +{ fIMMEXT(riV); fPCALIGN(riV); + fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, UiV); + fSET_LPCFG(2); + fWRITE_P3_LATE(0); +}) + +Q6INSN(J2_ploop3sr,"p3=sp3loop0(#r7:2,Rs32)",ATTRIBS(ATTRIB_HWLOOP_SETUP,A_ARCHV2,A_RESTRICT_LATEPRED,A_NOTE_LATEPRED,A_HWLOOP0_SETUP),"Initialize HW loop 0", +{ fIMMEXT(riV); fPCALIGN(riV); + fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, RsV); + fSET_LPCFG(3); + fWRITE_P3_LATE(0); +}) +Q6INSN(J2_ploop3si,"p3=sp3loop0(#r7:2,#U10)",ATTRIBS(ATTRIB_HWLOOP_SETUP,A_ARCHV2,A_RESTRICT_LATEPRED,A_NOTE_LATEPRED,A_HWLOOP0_SETUP),"Initialize HW loop 0", +{ fIMMEXT(riV); fPCALIGN(riV); + fWRITE_LOOP_REGS0(/*sa,lc*/ fREAD_PC()+riV, UiV); + fSET_LPCFG(3); + fWRITE_P3_LATE(0); +}) + + + +Q6INSN(J2_endloop01,"endloop01",ATTRIBS(ATTRIB_HWLOOP,A_HWLOOP0_END,A_HWLOOP1_END,A_RESTRICT_LATEPRED),"Loopend for inner and outer loop", +{ + + /* V2: With predicate control */ + if (fGET_LPCFG) { + fHIDE( if (fGET_LPCFG >= 2) { fWRITE_P3(fNOATTRIB_READ_P3()); } else ) + if (fGET_LPCFG==1) { + fWRITE_P3(0xff); + } + fSET_LPCFG(fGET_LPCFG-1); + fHIDE(MARK_LATE_PRED_WRITE(3)) + } + + /* check if iterate */ + if (fREAD_LC0>1) { + fBRANCH(fREAD_SA0,COF_TYPE_LOOPEND0); + fHIDE(fLOOPSTATS(fREAD_SA0);) + /* decrement loop count */ + fWRITE_LC0(fREAD_LC0-1); + } else { + /* check if iterate */ + if (fREAD_LC1>1) { + fBRANCH(fREAD_SA1,COF_TYPE_LOOPEND1); + fHIDE(fLOOPSTATS(fREAD_SA1);) + /* decrement loop count */ + fWRITE_LC1(fREAD_LC1-1); + } + } + +}) + +Q6INSN(J2_endloop0,"endloop0",ATTRIBS(ATTRIB_HWLOOP,A_HWLOOP0_END,A_RESTRICT_LATEPRED),"Loopend for inner loop", +{ + + /* V2: With predicate control */ + if (fGET_LPCFG) { + fHIDE( if (fGET_LPCFG >= 2) { fWRITE_P3(fNOATTRIB_READ_P3()); } else ) + if (fGET_LPCFG==1) { + fWRITE_P3(0xff); + } + fHIDE(MARK_LATE_PRED_WRITE(3)) + fSET_LPCFG(fGET_LPCFG-1); + } + + /* check if iterate */ + if (fREAD_LC0>1) { + fBRANCH(fREAD_SA0,COF_TYPE_LOOPEND0); + fHIDE(fLOOPSTATS(fREAD_SA0);) + /* decrement loop count */ + fWRITE_LC0(fREAD_LC0-1); + } +}) + +Q6INSN(J2_endloop1,"endloop1",ATTRIBS(ATTRIB_HWLOOP,A_HWLOOP1_END),"Loopend for outer loop", +{ + /* check if iterate */ + if (fREAD_LC1>1) { + fBRANCH(fREAD_SA1,COF_TYPE_LOOPEND1); + fHIDE(fLOOPSTATS(fREAD_SA1);) + /* decrement loop count */ + fWRITE_LC1(fREAD_LC1-1); + } +}) + + diff --git a/target/hexagon/imported/compare.idef b/target/hexagon/imported/compare.idef new file mode 100644 index 0000000..2efb95d --- /dev/null +++ b/target/hexagon/imported/compare.idef @@ -0,0 +1,639 @@ +/* + * Copyright (c) 2019 Qualcomm Innovation Center, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +/* + * Compare Instructions + */ + + + +/*********************************************/ +/* Scalar compare instructions */ +/*********************************************/ + +Q6INSN(C2_cmpeq,"Pd4=cmp.eq(Rs32,Rt32)",ATTRIBS(), +"Compare for Equal", +{PdV=f8BITSOF(RsV==RtV);}) + +Q6INSN(C2_cmpgt,"Pd4=cmp.gt(Rs32,Rt32)",ATTRIBS(), +"Compare for signed Greater Than", +{PdV=f8BITSOF(RsV>RtV);}) + +Q6INSN(C2_cmpgtu,"Pd4=cmp.gtu(Rs32,Rt32)",ATTRIBS(), +"Compare for Greater Than Unsigned", +{PdV=f8BITSOF(fCAST4u(RsV)>fCAST4u(RtV));}) + +Q6INSN(C2_cmpeqp,"Pd4=cmp.eq(Rss32,Rtt32)",ATTRIBS(), +"Compare for Equal", +{PdV=f8BITSOF(RssV==RttV);}) + +Q6INSN(C2_cmpgtp,"Pd4=cmp.gt(Rss32,Rtt32)",ATTRIBS(), +"Compare for signed Greater Than", +{PdV=f8BITSOF(RssV>RttV);}) + +Q6INSN(C2_cmpgtup,"Pd4=cmp.gtu(Rss32,Rtt32)",ATTRIBS(), +"Compare for Greater Than Unsigned", +{PdV=f8BITSOF(fCAST8u(RssV)>fCAST8u(RttV));}) + + + + +/*********************************************/ +/* Compare and put result in GPR */ +/* typically for function I/O */ +/*********************************************/ + +Q6INSN(A4_rcmpeqi,"Rd32=cmp.eq(Rs32,#s8)",ATTRIBS(), +"Compare for Equal", +{fIMMEXT(siV); RdV=(RsV==siV); }) + +Q6INSN(A4_rcmpneqi,"Rd32=!cmp.eq(Rs32,#s8)",ATTRIBS(), +"Compare for Equal", +{fIMMEXT(siV); RdV=(RsV!=siV); }) + + +Q6INSN(A4_rcmpeq,"Rd32=cmp.eq(Rs32,Rt32)",ATTRIBS(), +"Compare for Equal", +{RdV=(RsV==RtV); }) + +Q6INSN(A4_rcmpneq,"Rd32=!cmp.eq(Rs32,Rt32)",ATTRIBS(), +"Compare for Equal", +{RdV=(RsV!=RtV); }) + + + +/*********************************************/ +/* Scalar compare instructions */ +/*********************************************/ + + +Q6INSN(C2_bitsset,"Pd4=bitsset(Rs32,Rt32)",ATTRIBS(A_ARCHV2), +"Compare for selected bits set", +{PdV=f8BITSOF((RsV&RtV)==RtV);}) + +Q6INSN(C2_bitsclr,"Pd4=bitsclr(Rs32,Rt32)",ATTRIBS(A_ARCHV2), +"Compare for selected bits clear", +{PdV=f8BITSOF((RsV&RtV)==0);}) + + +Q6INSN(C4_nbitsset,"Pd4=!bitsset(Rs32,Rt32)",ATTRIBS(A_ARCHV2), +"Compare for selected bits set", +{PdV=f8BITSOF((RsV&RtV)!=RtV);}) + +Q6INSN(C4_nbitsclr,"Pd4=!bitsclr(Rs32,Rt32)",ATTRIBS(A_ARCHV2), +"Compare for selected bits clear", +{PdV=f8BITSOF((RsV&RtV)!=0);}) + + + +/*********************************************/ +/* Scalar compare instructions W/ immediate */ +/*********************************************/ + +Q6INSN(C2_cmpeqi,"Pd4=cmp.eq(Rs32,#s10)",ATTRIBS(), +"Compare for Equal", +{fIMMEXT(siV); PdV=f8BITSOF(RsV==siV);}) + +Q6INSN(C2_cmpgti,"Pd4=cmp.gt(Rs32,#s10)",ATTRIBS(), +"Compare for signed Greater Than", +{fIMMEXT(siV); PdV=f8BITSOF(RsV>siV);}) + +Q6INSN(C2_cmpgtui,"Pd4=cmp.gtu(Rs32,#u9)",ATTRIBS(), +"Compare for Greater Than Unsigned", +{fIMMEXT(uiV); PdV=f8BITSOF(fCAST4u(RsV)>fCAST4u(uiV));}) + +DEF_MAPPING(C2_cmpgei,"Pd4=cmp.ge(Rs32,#s8)","Pd4=cmp.gt(Rs32,#s8-1)") +DEF_COND_MAPPING(C2_cmpgeui,"Pd4=cmp.geu(Rs32,#u8)","#u8==0","Pd4=cmp.eq(Rs32,Rs32)","Pd4=cmp.gtu(Rs32,#u8-1)") + +/* Just for fun */ +DEF_V2_MAPPING(C2_cmplt,"Pd4=cmp.lt(Rs32,Rt32)","Pd4=cmp.gt(Rt32,Rs32)") +DEF_V2_MAPPING(C2_cmpltu,"Pd4=cmp.ltu(Rs32,Rt32)","Pd4=cmp.gtu(Rt32,Rs32)") + + +Q6INSN(C2_bitsclri,"Pd4=bitsclr(Rs32,#u6)",ATTRIBS(A_ARCHV2), +"Compare for selected bits clear", +{PdV=f8BITSOF((RsV&uiV)==0);}) + +Q6INSN(C4_nbitsclri,"Pd4=!bitsclr(Rs32,#u6)",ATTRIBS(A_ARCHV2), +"Compare for selected bits clear", +{PdV=f8BITSOF((RsV&uiV)!=0);}) + + + + +Q6INSN(C4_cmpneqi,"Pd4=!cmp.eq(Rs32,#s10)",ATTRIBS(), "Compare for Not Equal", {fIMMEXT(siV); PdV=f8BITSOF(RsV!=siV);}) +Q6INSN(C4_cmpltei,"Pd4=!cmp.gt(Rs32,#s10)",ATTRIBS(), "Compare for Less Than or Equal", {fIMMEXT(siV); PdV=f8BITSOF(RsV<=siV);}) +Q6INSN(C4_cmplteui,"Pd4=!cmp.gtu(Rs32,#u9)",ATTRIBS(), "Compare for Less Than or Equal Unsigned", {fIMMEXT(uiV); PdV=f8BITSOF(fCAST4u(RsV)<=fCAST4u(uiV));}) + +Q6INSN(C4_cmpneq,"Pd4=!cmp.eq(Rs32,Rt32)",ATTRIBS(), "And-Compare for Equal", {PdV=f8BITSOF(RsV!=RtV);}) +Q6INSN(C4_cmplte,"Pd4=!cmp.gt(Rs32,Rt32)",ATTRIBS(), "And-Compare for signed Greater Than", {PdV=f8BITSOF(RsV<=RtV);}) +Q6INSN(C4_cmplteu,"Pd4=!cmp.gtu(Rs32,Rt32)",ATTRIBS(), "And-Compare for Greater Than Unsigned", {PdV=f8BITSOF(fCAST4u(RsV)<=fCAST4u(RtV));}) + + + + + +/* Predicate Logical Operations */ + +Q6INSN(C2_and,"Pd4=and(Pt4,Ps4)",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Predicate AND", +{fPREDUSE_TIMING();PdV=PsV & PtV;}) + +Q6INSN(C2_or,"Pd4=or(Pt4,Ps4)",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Predicate OR", +{fPREDUSE_TIMING();PdV=PsV | PtV;}) + +Q6INSN(C2_xor,"Pd4=xor(Ps4,Pt4)",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Predicate XOR", +{fPREDUSE_TIMING();PdV=PsV ^ PtV;}) + +Q6INSN(C2_andn,"Pd4=and(Pt4,!Ps4)",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Predicate AND NOT", +{fPREDUSE_TIMING();PdV=PtV & (~PsV);}) + +Q6INSN(C2_not,"Pd4=not(Ps4)",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Logical NOT Predicate", +{fPREDUSE_TIMING();PdV=~PsV;}) + +Q6INSN(C2_orn,"Pd4=or(Pt4,!Ps4)",ATTRIBS(A_ARCHV2,A_CRSLOT23,A_NOTE_CRSLOT23), +"Predicate OR NOT", +{fPREDUSE_TIMING();PdV=PtV | (~PsV);}) + + + + + +Q6INSN(C4_and_and,"Pd4=and(Ps4,and(Pt4,Pu4))",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Compound And-And", { fPREDUSE_TIMING();PdV = PsV & PtV & PuV; }) + +Q6INSN(C4_and_or,"Pd4=and(Ps4,or(Pt4,Pu4))",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Compound And-Or", { fPREDUSE_TIMING();PdV = PsV & (PtV | PuV); }) + +Q6INSN(C4_or_and,"Pd4=or(Ps4,and(Pt4,Pu4))",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Compound Or-And", { fPREDUSE_TIMING();PdV = PsV | (PtV & PuV); }) + +Q6INSN(C4_or_or,"Pd4=or(Ps4,or(Pt4,Pu4))",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Compound Or-Or", { fPREDUSE_TIMING();PdV = PsV | PtV | PuV; }) + + + +Q6INSN(C4_and_andn,"Pd4=and(Ps4,and(Pt4,!Pu4))",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Compound And-And", { fPREDUSE_TIMING();PdV = PsV & PtV & (~PuV); }) + +Q6INSN(C4_and_orn,"Pd4=and(Ps4,or(Pt4,!Pu4))",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Compound And-Or", { fPREDUSE_TIMING();PdV = PsV & (PtV | (~PuV)); }) + +Q6INSN(C4_or_andn,"Pd4=or(Ps4,and(Pt4,!Pu4))",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Compound Or-And", { fPREDUSE_TIMING();PdV = PsV | (PtV & (~PuV)); }) + +Q6INSN(C4_or_orn,"Pd4=or(Ps4,or(Pt4,!Pu4))",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Compound Or-Or", { fPREDUSE_TIMING();PdV = PsV | PtV | (~PuV); }) + + + + + +DEF_V2_MAPPING(C2_pxfer_map,"Pd4=Ps4","Pd4=or(Ps4,Ps4)") + + +Q6INSN(C2_any8,"Pd4=any8(Ps4)",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Logical ANY of low 8 predicate bits", +{ fPREDUSE_TIMING();PsV ? (PdV=0xff) : (PdV=0x00); }) + +Q6INSN(C2_all8,"Pd4=all8(Ps4)",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Logical ALL of low 8 predicate bits", +{ fPREDUSE_TIMING();(PsV==0xff) ? (PdV=0xff) : (PdV=0x00); }) + +Q6INSN(C2_vitpack,"Rd32=vitpack(Ps4,Pt4)",ATTRIBS(), +"Pack the odd and even bits of two predicate registers", +{ fPREDUSE_TIMING();RdV = (PsV&0x55) | (PtV&0xAA); }) + +/* Mux instructions */ + +Q6INSN(C2_mux,"Rd32=mux(Pu4,Rs32,Rt32)",ATTRIBS(), +"Scalar MUX", +{ fPREDUSE_TIMING();(fLSBOLD(PuV)) ? (RdV=RsV):(RdV=RtV); }) + + +Q6INSN(C2_cmovenewit,"if (Pu4.new) Rd32=#s12",ATTRIBS(A_ARCHV2), +"Scalar conditional move", +{ fIMMEXT(siV); if (fLSBNEW(PuN)) RdV=siV; else CANCEL;}) + +Q6INSN(C2_cmovenewif,"if (!Pu4.new) Rd32=#s12",ATTRIBS(A_ARCHV2), +"Scalar conditional move", +{ fIMMEXT(siV); if (fLSBNEWNOT(PuN)) RdV=siV; else CANCEL;}) + +Q6INSN(C2_cmoveit,"if (Pu4) Rd32=#s12",ATTRIBS(A_ARCHV2), +"Scalar conditional move", +{ fIMMEXT(siV); if (fLSBOLD(PuV)) RdV=siV; else CANCEL;}) + +Q6INSN(C2_cmoveif,"if (!Pu4) Rd32=#s12",ATTRIBS(A_ARCHV2), +"Scalar conditional move", +{ fIMMEXT(siV); if (fLSBOLDNOT(PuV)) RdV=siV; else CANCEL;}) + + + +Q6INSN(C2_ccombinewnewt,"if (Pu4.new) Rdd32=combine(Rs32,Rt32)",ATTRIBS(A_ROPS_2,A_ARCHV2), +"Conditionally combine two words into a register pair", +{ if (fLSBNEW(PuN)) { + fSETWORD(0,RddV,RtV); + fSETWORD(1,RddV,RsV); + } else {CANCEL;} +}) + +Q6INSN(C2_ccombinewnewf,"if (!Pu4.new) Rdd32=combine(Rs32,Rt32)",ATTRIBS(A_ARCHV2,A_ROPS_2), +"Conditionally combine two words into a register pair", +{ if (fLSBNEWNOT(PuN)) { + fSETWORD(0,RddV,RtV); + fSETWORD(1,RddV,RsV); + } else {CANCEL;} +}) + +Q6INSN(C2_ccombinewt,"if (Pu4) Rdd32=combine(Rs32,Rt32)",ATTRIBS(A_ARCHV2,A_ROPS_2), +"Conditionally combine two words into a register pair", +{ if (fLSBOLD(PuV)) { + fSETWORD(0,RddV,RtV); + fSETWORD(1,RddV,RsV); + } else {CANCEL;} +}) + +Q6INSN(C2_ccombinewf,"if (!Pu4) Rdd32=combine(Rs32,Rt32)",ATTRIBS(A_ARCHV2,A_ROPS_2), +"Conditionally combine two words into a register pair", +{ if (fLSBOLDNOT(PuV)) { + fSETWORD(0,RddV,RtV); + fSETWORD(1,RddV,RsV); + } else {CANCEL;} +}) + + + +Q6INSN(C2_muxii,"Rd32=mux(Pu4,#s8,#S8)",ATTRIBS(A_ARCHV2), +"Scalar MUX immediates", +{ fPREDUSE_TIMING();fIMMEXT(siV); (fLSBOLD(PuV)) ? (RdV=siV):(RdV=SiV); }) + + + +Q6INSN(C2_muxir,"Rd32=mux(Pu4,Rs32,#s8)",ATTRIBS(A_ARCHV2), +"Scalar MUX register immediate", +{ fPREDUSE_TIMING();fIMMEXT(siV); (fLSBOLD(PuV)) ? (RdV=RsV):(RdV=siV); }) + + +Q6INSN(C2_muxri,"Rd32=mux(Pu4,#s8,Rs32)",ATTRIBS(A_ARCHV2), +"Scalar MUX register immediate", +{ fPREDUSE_TIMING();fIMMEXT(siV); (fLSBOLD(PuV)) ? (RdV=siV):(RdV=RsV); }) + + + +Q6INSN(C2_vmux,"Rdd32=vmux(Pu4,Rss32,Rtt32)",ATTRIBS(), +"Vector MUX", +{ fPREDUSE_TIMING(); + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBYTE(i,RddV,(fGETBIT(i,PuV)?(fGETBYTE(i,RssV)):(fGETBYTE(i,RttV)))); + } +}) + +Q6INSN(C2_mask,"Rdd32=mask(Pt4)",ATTRIBS(), +"Vector Mask Generation", +{ fPREDUSE_TIMING(); + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBYTE(i,RddV,(fGETBIT(i,PtV)?(0xff):(0x00))); + } +}) + +/* VCMP */ + +Q6INSN(A2_vcmpbeq,"Pd4=vcmpb.eq(Rss32,Rtt32)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBIT(i,PdV,(fGETBYTE(i,RssV) == fGETBYTE(i,RttV))); + } +}) + +Q6INSN(A4_vcmpbeqi,"Pd4=vcmpb.eq(Rss32,#u8)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBIT(i,PdV,(fGETUBYTE(i,RssV) == uiV)); + } +}) + +Q6INSN(A4_vcmpbeq_any,"Pd4=any8(vcmpb.eq(Rss32,Rtt32))",ATTRIBS(), +"Compare elements of two vectors ", +{ + fHIDE(int i;) + PdV = 0; + for (i = 0; i < 8; i++) { + if (fGETBYTE(i,RssV) == fGETBYTE(i,RttV)) PdV = 0xff; + } +}) + +Q6INSN(A6_vcmpbeq_notany,"Pd4=!any8(vcmpb.eq(Rss32,Rtt32))",ATTRIBS(), +"Compare elements of two vectors ", +{ + fHIDE(int i;) + PdV = 0; + for (i = 0; i < 8; i++) { + if (fGETBYTE(i,RssV) == fGETBYTE(i,RttV)) PdV = 0xff; + } + PdV = ~PdV; +}) + +Q6INSN(A2_vcmpbgtu,"Pd4=vcmpb.gtu(Rss32,Rtt32)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBIT(i,PdV,(fGETUBYTE(i,RssV) > fGETUBYTE(i,RttV))); + } +}) + +Q6INSN(A4_vcmpbgtui,"Pd4=vcmpb.gtu(Rss32,#u7)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBIT(i,PdV,(fGETUBYTE(i,RssV) > uiV)); + } +}) + +Q6INSN(A4_vcmpbgt,"Pd4=vcmpb.gt(Rss32,Rtt32)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBIT(i,PdV,(fGETBYTE(i,RssV) > fGETBYTE(i,RttV))); + } +}) + +Q6INSN(A4_vcmpbgti,"Pd4=vcmpb.gt(Rss32,#s8)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fHIDE(int i;) + for (i = 0; i < 8; i++) { + fSETBIT(i,PdV,(fGETBYTE(i,RssV) > siV)); + } +}) + + + +Q6INSN(A4_cmpbeq,"Pd4=cmpb.eq(Rs32,Rt32)",ATTRIBS(), +"Compare bytes ", +{ + PdV=f8BITSOF(fGETBYTE(0,RsV) == fGETBYTE(0,RtV)); +}) + +Q6INSN(A4_cmpbeqi,"Pd4=cmpb.eq(Rs32,#u8)",ATTRIBS(), +"Compare bytes ", +{ + PdV=f8BITSOF(fGETUBYTE(0,RsV) == uiV); +}) + +Q6INSN(A4_cmpbgtu,"Pd4=cmpb.gtu(Rs32,Rt32)",ATTRIBS(), +"Compare bytes ", +{ + PdV=f8BITSOF(fGETUBYTE(0,RsV) > fGETUBYTE(0,RtV)); +}) + +Q6INSN(A4_cmpbgtui,"Pd4=cmpb.gtu(Rs32,#u7)",ATTRIBS(), +"Compare bytes ", +{ + fIMMEXT(uiV); + PdV=f8BITSOF(fGETUBYTE(0,RsV) > fCAST4u(uiV)); +}) + +Q6INSN(A4_cmpbgt,"Pd4=cmpb.gt(Rs32,Rt32)",ATTRIBS(), +"Compare bytes ", +{ + PdV=f8BITSOF(fGETBYTE(0,RsV) > fGETBYTE(0,RtV)); +}) + +Q6INSN(A4_cmpbgti,"Pd4=cmpb.gt(Rs32,#s8)",ATTRIBS(), +"Compare bytes ", +{ + PdV=f8BITSOF(fGETBYTE(0,RsV) > siV); +}) + +Q6INSN(A2_vcmpheq,"Pd4=vcmph.eq(Rss32,Rtt32)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fHIDE(int i;) + for (i = 0; i < 4; i++) { + fSETBIT(i*2,PdV, (fGETHALF(i,RssV) == fGETHALF(i,RttV))); + fSETBIT(i*2+1,PdV,(fGETHALF(i,RssV) == fGETHALF(i,RttV))); + } +}) + +Q6INSN(A2_vcmphgt,"Pd4=vcmph.gt(Rss32,Rtt32)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fHIDE(int i;) + for (i = 0; i < 4; i++) { + fSETBIT(i*2, PdV, (fGETHALF(i,RssV) > fGETHALF(i,RttV))); + fSETBIT(i*2+1,PdV, (fGETHALF(i,RssV) > fGETHALF(i,RttV))); + } +}) + +Q6INSN(A2_vcmphgtu,"Pd4=vcmph.gtu(Rss32,Rtt32)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fHIDE(int i;) + for (i = 0; i < 4; i++) { + fSETBIT(i*2, PdV, (fGETUHALF(i,RssV) > fGETUHALF(i,RttV))); + fSETBIT(i*2+1,PdV, (fGETUHALF(i,RssV) > fGETUHALF(i,RttV))); + } +}) + +Q6INSN(A4_vcmpheqi,"Pd4=vcmph.eq(Rss32,#s8)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fHIDE(int i;) + for (i = 0; i < 4; i++) { + fSETBIT(i*2,PdV, (fGETHALF(i,RssV) == siV)); + fSETBIT(i*2+1,PdV,(fGETHALF(i,RssV) == siV)); + } +}) + +Q6INSN(A4_vcmphgti,"Pd4=vcmph.gt(Rss32,#s8)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fHIDE(int i;) + for (i = 0; i < 4; i++) { + fSETBIT(i*2, PdV, (fGETHALF(i,RssV) > siV)); + fSETBIT(i*2+1,PdV, (fGETHALF(i,RssV) > siV)); + } +}) + + +Q6INSN(A4_vcmphgtui,"Pd4=vcmph.gtu(Rss32,#u7)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fHIDE(int i;) + for (i = 0; i < 4; i++) { + fSETBIT(i*2, PdV, (fGETUHALF(i,RssV) > uiV)); + fSETBIT(i*2+1,PdV, (fGETUHALF(i,RssV) > uiV)); + } +}) + +Q6INSN(A4_cmpheq,"Pd4=cmph.eq(Rs32,Rt32)",ATTRIBS(), +"Compare halfwords ", +{ + PdV=f8BITSOF(fGETHALF(0,RsV) == fGETHALF(0,RtV)); +}) + +Q6INSN(A4_cmphgt,"Pd4=cmph.gt(Rs32,Rt32)",ATTRIBS(), +"Compare halfwords ", +{ + PdV=f8BITSOF(fGETHALF(0,RsV) > fGETHALF(0,RtV)); +}) + +Q6INSN(A4_cmphgtu,"Pd4=cmph.gtu(Rs32,Rt32)",ATTRIBS(), +"Compare halfwords ", +{ + PdV=f8BITSOF(fGETUHALF(0,RsV) > fGETUHALF(0,RtV)); +}) + +Q6INSN(A4_cmpheqi,"Pd4=cmph.eq(Rs32,#s8)",ATTRIBS(), +"Compare halfwords ", +{ + fIMMEXT(siV); + PdV=f8BITSOF(fGETHALF(0,RsV) == siV); +}) + +Q6INSN(A4_cmphgti,"Pd4=cmph.gt(Rs32,#s8)",ATTRIBS(), +"Compare halfwords ", +{ + fIMMEXT(siV); + PdV=f8BITSOF(fGETHALF(0,RsV) > siV); +}) + +Q6INSN(A4_cmphgtui,"Pd4=cmph.gtu(Rs32,#u7)",ATTRIBS(), +"Compare halfwords ", +{ + fIMMEXT(uiV); + PdV=f8BITSOF(fGETUHALF(0,RsV) > fCAST4u(uiV)); +}) + +Q6INSN(A2_vcmpweq,"Pd4=vcmpw.eq(Rss32,Rtt32)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fSETBITS(3,0,PdV,(fGETWORD(0,RssV)==fGETWORD(0,RttV))); + fSETBITS(7,4,PdV,(fGETWORD(1,RssV)==fGETWORD(1,RttV))); +}) + +Q6INSN(A2_vcmpwgt,"Pd4=vcmpw.gt(Rss32,Rtt32)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fSETBITS(3,0,PdV,(fGETWORD(0,RssV)>fGETWORD(0,RttV))); + fSETBITS(7,4,PdV,(fGETWORD(1,RssV)>fGETWORD(1,RttV))); +}) + +Q6INSN(A2_vcmpwgtu,"Pd4=vcmpw.gtu(Rss32,Rtt32)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fSETBITS(3,0,PdV,(fGETUWORD(0,RssV)>fGETUWORD(0,RttV))); + fSETBITS(7,4,PdV,(fGETUWORD(1,RssV)>fGETUWORD(1,RttV))); +}) + +Q6INSN(A4_vcmpweqi,"Pd4=vcmpw.eq(Rss32,#s8)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fSETBITS(3,0,PdV,(fGETWORD(0,RssV)==siV)); + fSETBITS(7,4,PdV,(fGETWORD(1,RssV)==siV)); +}) + +Q6INSN(A4_vcmpwgti,"Pd4=vcmpw.gt(Rss32,#s8)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fSETBITS(3,0,PdV,(fGETWORD(0,RssV)>siV)); + fSETBITS(7,4,PdV,(fGETWORD(1,RssV)>siV)); +}) + +Q6INSN(A4_vcmpwgtui,"Pd4=vcmpw.gtu(Rss32,#u7)",ATTRIBS(), +"Compare elements of two vectors ", +{ + fSETBITS(3,0,PdV,(fGETUWORD(0,RssV)>fCAST4u(uiV))); + fSETBITS(7,4,PdV,(fGETUWORD(1,RssV)>fCAST4u(uiV))); +}) + +Q6INSN(A4_boundscheck_hi,"Pd4=boundscheck(Rss32,Rtt32):raw:hi",ATTRIBS(), +"Detect if a register is within bounds", +{ + fHIDE(size4u_t src;) + src = fGETUWORD(1,RssV); + PdV = f8BITSOF((fCAST4u(src) >= fGETUWORD(0,RttV)) && (fCAST4u(src) < fGETUWORD(1,RttV))); +}) + +Q6INSN(A4_boundscheck_lo,"Pd4=boundscheck(Rss32,Rtt32):raw:lo",ATTRIBS(), +"Detect if a register is within bounds", +{ + fHIDE(size4u_t src;) + src = fGETUWORD(0,RssV); + PdV = f8BITSOF((fCAST4u(src) >= fGETUWORD(0,RttV)) && (fCAST4u(src) < fGETUWORD(1,RttV))); +}) + +DEF_V4_COND_MAPPING(A4_boundscheck,"Pd4=boundscheck(Rs32,Rtt32)","Rs32 & 1","Pd4=boundscheck(Rss32,Rtt32):raw:hi","Pd4=boundscheck(Rss32,Rtt32):raw:lo") + +Q6INSN(A4_tlbmatch,"Pd4=tlbmatch(Rss32,Rt32)",ATTRIBS(A_NOTE_LATEPRED,A_RESTRICT_LATEPRED), +"Detect if a VA/ASID matches a TLB entry", +{ + fHIDE(size4u_t TLBHI; size4u_t TLBLO; size4u_t MASK; size4u_t SIZE;) + MASK = 0x07ffffff; + TLBLO = fGETUWORD(0,RssV); + TLBHI = fGETUWORD(1,RssV); + SIZE = fMIN(6,fCL1_4(~fBREV_4(TLBLO))); + MASK &= (0xffffffff << 2*SIZE); + PdV = f8BITSOF(fGETBIT(31,TLBHI) && ((TLBHI & MASK) == (RtV & MASK))); + fHIDE(MARK_LATE_PRED_WRITE(PdN)) +}) + +Q6INSN(C2_tfrpr,"Rd32=Ps4",ATTRIBS(), +"Transfer predicate to general register", { fPREDUSE_TIMING();RdV = fZXTN(8,32,PsV); }) + +Q6INSN(C2_tfrrp,"Pd4=Rs32",ATTRIBS(), +"Transfer general register to Predicate", { PdV = fGETUBYTE(0,RsV); }) + +Q6INSN(C4_fastcorner9,"Pd4=fastcorner9(Ps4,Pt4)",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Determine whether the predicate sources define a corner", +{ fPREDUSE_TIMING(); + fHIDE(size4u_t tmp = 0; size4u_t i;) + fSETHALF(0,tmp,(PsV<<8)|PtV); + fSETHALF(1,tmp,(PsV<<8)|PtV); + for (i = 1; i < 9; i++) { + tmp &= tmp >> 1; + } + PdV = f8BITSOF(tmp != 0); +}) + +Q6INSN(C4_fastcorner9_not,"Pd4=!fastcorner9(Ps4,Pt4)",ATTRIBS(A_CRSLOT23,A_NOTE_CRSLOT23), +"Determine whether the predicate sources define a corner", +{ + fPREDUSE_TIMING(); + fHIDE(size4u_t tmp = 0; size4u_t i;) + fSETHALF(0,tmp,(PsV<<8)|PtV); + fSETHALF(1,tmp,(PsV<<8)|PtV); + for (i = 1; i < 9; i++) { + tmp &= tmp >> 1; + } + PdV = f8BITSOF(tmp == 0); +}) + + diff --git a/target/hexagon/imported/float.idef b/target/hexagon/imported/float.idef new file mode 100644 index 0000000..d3bc57d --- /dev/null +++ b/target/hexagon/imported/float.idef @@ -0,0 +1,498 @@ +/* + * Copyright (c) 2019 Qualcomm Innovation Center, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +/* + * Floating-Point Instructions + */ + +/*************************************/ +/* Scalar FP */ +/*************************************/ +Q6INSN(F2_sfadd,"Rd32=sfadd(Rs32,Rt32)",ATTRIBS(), +"Floating-Point Add", +{ RdV=fUNFLOAT(fFLOAT(RsV)+fFLOAT(RtV));}) + +Q6INSN(F2_sfsub,"Rd32=sfsub(Rs32,Rt32)",ATTRIBS(), +"Floating-Point Subtract", +{ RdV=fUNFLOAT(fFLOAT(RsV)-fFLOAT(RtV));}) + +Q6INSN(F2_sfmpy,"Rd32=sfmpy(Rs32,Rt32)",ATTRIBS(), +"Floating-Point Multiply", +{ RdV=fUNFLOAT(fSFMPY(fFLOAT(RsV),fFLOAT(RtV)));}) + +Q6INSN(F2_sffma,"Rx32+=sfmpy(Rs32,Rt32)",ATTRIBS(), +"Floating-Point Fused Multiply Add", +{ RxV=fUNFLOAT(fFMAF(fFLOAT(RsV),fFLOAT(RtV),fFLOAT(RxV)));}) + +Q6INSN(F2_sffma_sc,"Rx32+=sfmpy(Rs32,Rt32,Pu4):scale",ATTRIBS(), +"Floating-Point Fused Multiply Add w/ Additional Scaling (2**Pu)", +{ + fPREDUSE_TIMING(); + fHIDE(size4s_t tmp;) + fCHECKSFNAN3(RxV,RxV,RsV,RtV); + tmp=fUNFLOAT(fFMAFX(fFLOAT(RsV),fFLOAT(RtV),fFLOAT(RxV),PuV)); + if (!((fFLOAT(RxV) == 0.0) && fISZEROPROD(fFLOAT(RsV),fFLOAT(RtV)))) RxV = tmp; +}) + +Q6INSN(F2_sffms,"Rx32-=sfmpy(Rs32,Rt32)",ATTRIBS(), +"Floating-Point Fused Multiply Add", +{ RxV=fUNFLOAT(fFMAF(-fFLOAT(RsV),fFLOAT(RtV),fFLOAT(RxV))); }) + +Q6INSN(F2_sffma_lib,"Rx32+=sfmpy(Rs32,Rt32):lib",ATTRIBS(), +"Floating-Point Fused Multiply Add for Library Routines", +{ fFPSETROUND_NEAREST(); fHIDE(int infinp; int infminusinf; size4s_t tmp;) + infminusinf = ((isinf(fFLOAT(RxV))) && + (fISINFPROD(fFLOAT(RsV),fFLOAT(RtV))) && + (fGETBIT(31,RsV ^ RxV ^ RtV) != 0)); + infinp = (isinf(fFLOAT(RxV))) || (isinf(fFLOAT(RtV))) || (isinf(fFLOAT(RsV))); + fCHECKSFNAN3(RxV,RxV,RsV,RtV); + tmp=fUNFLOAT(fFMAF(fFLOAT(RsV),fFLOAT(RtV),fFLOAT(RxV))); + if (!((fFLOAT(RxV) == 0.0) && fISZEROPROD(fFLOAT(RsV),fFLOAT(RtV)))) RxV = tmp; + fFPCANCELFLAGS(); + if (isinf(fFLOAT(RxV)) && !infinp) RxV = RxV - 1; + if (infminusinf) RxV = 0; +}) + +Q6INSN(F2_sffms_lib,"Rx32-=sfmpy(Rs32,Rt32):lib",ATTRIBS(), +"Floating-Point Fused Multiply Add for Library Routines", +{ fFPSETROUND_NEAREST(); fHIDE(int infinp; int infminusinf; size4s_t tmp;) + infminusinf = ((isinf(fFLOAT(RxV))) && + (fISINFPROD(fFLOAT(RsV),fFLOAT(RtV))) && + (fGETBIT(31,RsV ^ RxV ^ RtV) == 0)); + infinp = (isinf(fFLOAT(RxV))) || (isinf(fFLOAT(RtV))) || (isinf(fFLOAT(RsV))); + fCHECKSFNAN3(RxV,RxV,RsV,RtV); + tmp=fUNFLOAT(fFMAF(-fFLOAT(RsV),fFLOAT(RtV),fFLOAT(RxV))); + if (!((fFLOAT(RxV) == 0.0) && fISZEROPROD(fFLOAT(RsV),fFLOAT(RtV)))) RxV = tmp; + fFPCANCELFLAGS(); + if (isinf(fFLOAT(RxV)) && !infinp) RxV = RxV - 1; + if (infminusinf) RxV = 0; +}) + + +Q6INSN(F2_sfcmpeq,"Pd4=sfcmp.eq(Rs32,Rt32)",ATTRIBS(), +"Floating Point Compare for Equal", +{PdV=f8BITSOF(fFLOAT(RsV)==fFLOAT(RtV));}) + +Q6INSN(F2_sfcmpgt,"Pd4=sfcmp.gt(Rs32,Rt32)",ATTRIBS(), +"Floating-Point Compare for Greater Than", +{PdV=f8BITSOF(fFLOAT(RsV)>fFLOAT(RtV));}) + +/* cmpge is not the same as !cmpgt(swapops) in IEEE */ + +Q6INSN(F2_sfcmpge,"Pd4=sfcmp.ge(Rs32,Rt32)",ATTRIBS(), +"Floating-Point Compare for Greater Than / Equal To", +{PdV=f8BITSOF(fFLOAT(RsV)>=fFLOAT(RtV));}) + +/* Everyone seems to have this... */ + +Q6INSN(F2_sfcmpuo,"Pd4=sfcmp.uo(Rs32,Rt32)",ATTRIBS(), +"Floating-Point Compare for Unordered", +{PdV=f8BITSOF(isunordered(fFLOAT(RsV),fFLOAT(RtV)));}) + + +Q6INSN(F2_sfmax,"Rd32=sfmax(Rs32,Rt32)",ATTRIBS(), +"Maximum of Floating-Point values", +{ RdV = fUNFLOAT(fSF_MAX(fFLOAT(RsV),fFLOAT(RtV))); }) + +Q6INSN(F2_sfmin,"Rd32=sfmin(Rs32,Rt32)",ATTRIBS(), +"Minimum of Floating-Point values", +{ RdV = fUNFLOAT(fSF_MIN(fFLOAT(RsV),fFLOAT(RtV))); }) + + +Q6INSN(F2_sfclass,"Pd4=sfclass(Rs32,#u5)",ATTRIBS(), +"Classify Floating-Point Value", +{ + fHIDE(int class;) + PdV = 0; + class = fpclassify(fFLOAT(RsV)); + /* Is the value zero? */ + if (fGETBIT(0,uiV) && (class == FP_ZERO)) PdV = 0xff; + if (fGETBIT(1,uiV) && (class == FP_NORMAL)) PdV = 0xff; + if (fGETBIT(2,uiV) && (class == FP_SUBNORMAL)) PdV = 0xff; + if (fGETBIT(3,uiV) && (class == FP_INFINITE)) PdV = 0xff; + if (fGETBIT(4,uiV) && (class == FP_NAN)) PdV = 0xff; + fFPCANCELFLAGS(); +}) + +/* Range: +/- (1.0 .. 1+(63/64)) * 2**(-6 .. +9) */ +/* More immediate bits should probably be used for more precision? */ + +Q6INSN(F2_sfimm_p,"Rd32=sfmake(#u10):pos",ATTRIBS(A_SFMAKE), +"Make Floating Point Value", +{ + RdV = (127 - 6) << 23; + RdV += uiV << 17; +}) + +Q6INSN(F2_sfimm_n,"Rd32=sfmake(#u10):neg",ATTRIBS(A_SFMAKE), +"Make Floating Point Value", +{ + RdV = (127 - 6) << 23; + RdV += (uiV << 17); + RdV |= (1 << 31); +}) + + +Q6INSN(F2_sfdivcheat,"Rd32=sfdivcheat(Rs32,Rt32)",ATTRIBS(A_FAKEINSN), +"Cheating Instruction for Divide Testing", +{ + RdV = fUNFLOAT(fFLOAT(RsV) / fFLOAT(RtV)); +}) + +Q6INSN(F2_sfrecipa,"Rd32,Pe4=sfrecipa(Rs32,Rt32)",ATTRIBS(A_NOTE_COMPAT_ACCURACY,A_RESTRICT_LATEPRED,A_NOTE_LATEPRED), +"Reciprocal Approximation for Division", +{ + fHIDE(int idx;) + fHIDE(int adjust;) + fHIDE(int mant;) + fHIDE(int exp;) + if (fSF_RECIP_COMMON(RsV,RtV,RdV,adjust)) { + PeV = adjust; + idx = (RtV >> 16) & 0x7f; + mant = (fSF_RECIP_LOOKUP(idx) << 15) | 1; + exp = fSF_BIAS() - (fSF_GETEXP(RtV) - fSF_BIAS()) - 1; + RdV = fMAKESF(fGETBIT(31,RtV),exp,mant); + } +}) + +Q6INSN(F2_sffixupn,"Rd32=sffixupn(Rs32,Rt32)",ATTRIBS(), +"Fix Up Numerator", +{ + fHIDE(int adjust;) + fSF_RECIP_COMMON(RsV,RtV,RdV,adjust); + RdV = RsV; +}) + +Q6INSN(F2_sffixupd,"Rd32=sffixupd(Rs32,Rt32)",ATTRIBS(), +"Fix Up Denominator", +{ + fHIDE(int adjust;) + fSF_RECIP_COMMON(RsV,RtV,RdV,adjust); + RdV = RtV; +}) + +Q6INSN(F2_sfinvsqrta,"Rd32,Pe4=sfinvsqrta(Rs32)",ATTRIBS(A_NOTE_COMPAT_ACCURACY,A_NOTE_LATEPRED,A_RESTRICT_LATEPRED), +"Reciprocal Square Root Approximation", +{ + fHIDE(int idx;) + fHIDE(int adjust;) + fHIDE(int mant;) + fHIDE(int exp;) + if (fSF_INVSQRT_COMMON(RsV,RdV,adjust)) { + PeV = adjust; + idx = (RsV >> 17) & 0x7f; + mant = (fSF_INVSQRT_LOOKUP(idx) << 15); + exp = fSF_BIAS() - ((fSF_GETEXP(RsV) - fSF_BIAS()) >> 1) - 1; + RdV = fMAKESF(fGETBIT(31,RsV),exp,mant); + } +}) + +Q6INSN(F2_sffixupr,"Rd32=sffixupr(Rs32)",ATTRIBS(), +"Fix Up Radicand", +{ + fHIDE(int adjust;) + fSF_INVSQRT_COMMON(RsV,RdV,adjust); + RdV = RsV; +}) + +Q6INSN(F2_sfsqrtcheat,"Rd32=sfsqrtcheat(Rs32)",ATTRIBS(A_FAKEINSN), +"Cheating instruction for SQRT Testing", +{ + RdV = fUNFLOAT(sqrtf(fFLOAT(RsV))); +}) + +/*************************************/ +/* Scalar DP */ +/*************************************/ +Q6INSN(F2_dfadd,"Rdd32=dfadd(Rss32,Rtt32)",ATTRIBS(), +"Floating-Point Add", +{ RddV=fUNDOUBLE(fDOUBLE(RssV)+fDOUBLE(RttV));}) + +Q6INSN(F2_dfsub,"Rdd32=dfsub(Rss32,Rtt32)",ATTRIBS(), +"Floating-Point Subtract", +{ RddV=fUNDOUBLE(fDOUBLE(RssV)-fDOUBLE(RttV));}) + +Q6INSN(F2_dfmax,"Rdd32=dfmax(Rss32,Rtt32)",ATTRIBS(), +"Maximum of Floating-Point values", +{ RddV = fUNDOUBLE(fDF_MAX(fDOUBLE(RssV),fDOUBLE(RttV))); }) + +Q6INSN(F2_dfmin,"Rdd32=dfmin(Rss32,Rtt32)",ATTRIBS(), +"Minimum of Floating-Point values", +{ RddV = fUNDOUBLE(fDF_MIN(fDOUBLE(RssV),fDOUBLE(RttV))); }) + +Q6INSN(F2_dfmpyfix,"Rdd32=dfmpyfix(Rss32,Rtt32)",ATTRIBS(), +"Fix Up Multiplicand for Multiplication", +{ + if (fDF_ISDENORM(RssV) && fDF_ISBIG(RttV) && fDF_ISNORMAL(RttV)) RddV = fUNDOUBLE(fDOUBLE(RssV) * 0x1.0p52); + else if (fDF_ISDENORM(RttV) && fDF_ISBIG(RssV) && fDF_ISNORMAL(RssV)) RddV = fUNDOUBLE(fDOUBLE(RssV) * 0x1.0p-52); + else RddV = RssV; +}) + +Q6INSN(F2_dfmpyll,"Rdd32=dfmpyll(Rss32,Rtt32)",ATTRIBS(), +"Multiply low*low and shift off low 32 bits into sticky (in MSB)", +{ + fHIDE(size8u_t prod;) + prod = fMPY32UU(fGETUWORD(0,RssV),fGETUWORD(0,RttV)); + RddV = (prod >> 32) << 1; + if (fGETUWORD(0,prod) != 0) fSETBIT(0,RddV,1); +}) + +Q6INSN(F2_dfmpylh,"Rxx32+=dfmpylh(Rss32,Rtt32)",ATTRIBS(), +"Multiply low*high and accumulate", +{ + RxxV += (fGETUWORD(0,RssV) * (0x00100000 | fZXTN(20,64,fGETUWORD(1,RttV)))) << 1; +}) + +Q6INSN(F2_dfmpyhh,"Rxx32+=dfmpyhh(Rss32,Rtt32)",ATTRIBS(), +"Multiply high*high and accumulate with L*H value", +{ + RxxV = fUNDOUBLE(fDF_MPY_HH(fDOUBLE(RssV),fDOUBLE(RttV),RxxV)); +}) + + + +#ifdef ADD_DP_OPS +Q6INSN(F2_dfmpy,"Rdd32=dfmpy(Rss32,Rtt32)",ATTRIBS(A_FAKEINSN), +"Floating-Point Multiply", +{ FLATENCY(4);RddV=fUNDOUBLE(fDFMPY(fDOUBLE(RssV),fDOUBLE(RttV)));}) + +Q6INSN(F2_dffma,"Rxx32+=dfmpy(Rss32,Rtt32)",ATTRIBS(A_FAKEINSN), +"Floating-Point Fused Multiply Add", +{ FLATENCY(5);RxxV=fUNDOUBLE(fFMA(fDOUBLE(RssV),fDOUBLE(RttV),fDOUBLE(RxxV)));}) + +Q6INSN(F2_dffms,"Rxx32-=dfmpy(Rss32,Rtt32)",ATTRIBS(A_FAKEINSN), +"Floating-Point Fused Multiply Add", +{ FLATENCY(5);RxxV=fUNDOUBLE(fFMA(-fDOUBLE(RssV),fDOUBLE(RttV),fDOUBLE(RxxV)));}) + +Q6INSN(F2_dffma_lib,"Rxx32+=dfmpy(Rss32,Rtt32):lib",ATTRIBS(A_FAKEINSN), +"Floating-Point Fused Multiply Add for Library Routines", +{ FLATENCY(5);fFPSETROUND_NEAREST(); fHIDE(int infinp; int infminusinf;) + infinp = (isinf(fDOUBLE(RxxV))) || (isinf(fDOUBLE(RttV))) || (isinf(fDOUBLE(RssV))); + infminusinf = ((isinf(fDOUBLE(RxxV))) && + (fISINFPROD(fDOUBLE(RssV),fDOUBLE(RttV))) && + (fGETBIT(63,RssV ^ RxxV ^ RttV) != 0)); + fCHECKDFNAN3(RxxV,RxxV,RssV,RttV); + if ((fDOUBLE(RssV) != 0.0) && (fDOUBLE(RttV) != 0.0)) { + RxxV=fUNDOUBLE(fFMA(fDOUBLE(RssV),fDOUBLE(RttV),fDOUBLE(RxxV))); + } else { + if (isinf(fDOUBLE(RssV)) || isinf(fDOUBLE(RttV))) RxxV = fDFNANVAL(); + } + fFPCANCELFLAGS(); + if (isinf(fDOUBLE(RxxV)) && !infinp) RxxV = RxxV - 1; + if (infminusinf) RxxV = 0; +}) + + +Q6INSN(F2_dffms_lib,"Rxx32-=dfmpy(Rss32,Rtt32):lib",ATTRIBS(A_FAKEINSN), +"Floating-Point Fused Multiply Add for Library Routines", +{ FLATENCY(5); fFPSETROUND_NEAREST(); fHIDE(int infinp; int infminusinf;) + infinp = (isinf(fDOUBLE(RxxV))) || (isinf(fDOUBLE(RttV))) || (isinf(fDOUBLE(RssV))); + infminusinf = ((isinf(fDOUBLE(RxxV))) && + (fISINFPROD(fDOUBLE(RssV),fDOUBLE(RttV))) && + (fGETBIT(63,RssV ^ RxxV ^ RttV) == 0)); + fCHECKDFNAN3(RxxV,RxxV,RssV,RttV); + if ((fDOUBLE(RssV) != 0.0) && (fDOUBLE(RttV) != 0.0)) { + RxxV=fUNDOUBLE(fFMA(-fDOUBLE(RssV),fDOUBLE(RttV),fDOUBLE(RxxV))); + } else { + if (isinf(fDOUBLE(RssV)) || isinf(fDOUBLE(RttV))) RxxV = fDFNANVAL(); + } + fFPCANCELFLAGS(); + if (isinf(fDOUBLE(RxxV)) && !infinp) RxxV = RxxV - 1; + if (infminusinf) RxxV = 0; +}) + + +Q6INSN(F2_dffma_sc,"Rxx32+=dfmpy(Rss32,Rtt32,Pu4):scale",ATTRIBS(A_FAKEINSN), +"Floating-Point Fused Multiply Add w/ Additional Scaling (2**(4*Pu))", +{ FLATENCY(5); + fHIDE(size8s_t tmp;) + fCHECKDFNAN3(RxxV,RxxV,RssV,RttV); + tmp=fUNDOUBLE(fFMAX(fDOUBLE(RssV),fDOUBLE(RttV),fDOUBLE(RxxV),PuV)); + if ((fDOUBLE(tmp) != fDOUBLE(RxxV)) || ((fDOUBLE(RssV) != 0.0) && (fDOUBLE(RttV) != 0.0))) RxxV = tmp; +}) + +#endif + +Q6INSN(F2_dfcmpeq,"Pd4=dfcmp.eq(Rss32,Rtt32)",ATTRIBS(), +"Floating Point Compare for Equal", +{PdV=f8BITSOF(fDOUBLE(RssV)==fDOUBLE(RttV));}) + +Q6INSN(F2_dfcmpgt,"Pd4=dfcmp.gt(Rss32,Rtt32)",ATTRIBS(), +"Floating-Point Compare for Greater Than", +{PdV=f8BITSOF(fDOUBLE(RssV)>fDOUBLE(RttV));}) + + +/* cmpge is not the same as !cmpgt(swapops) in IEEE */ + +Q6INSN(F2_dfcmpge,"Pd4=dfcmp.ge(Rss32,Rtt32)",ATTRIBS(), +"Floating-Point Compare for Greater Than / Equal To", +{PdV=f8BITSOF(fDOUBLE(RssV)>=fDOUBLE(RttV));}) + +/* Everyone seems to have this... */ + +Q6INSN(F2_dfcmpuo,"Pd4=dfcmp.uo(Rss32,Rtt32)",ATTRIBS(), +"Floating-Point Compare for Unordered", +{PdV=f8BITSOF(isunordered(fDOUBLE(RssV),fDOUBLE(RttV)));}) + + +Q6INSN(F2_dfclass,"Pd4=dfclass(Rss32,#u5)",ATTRIBS(), +"Classify Floating-Point Value", +{ + fHIDE(int class;) + PdV = 0; + class = fpclassify(fDOUBLE(RssV)); + /* Is the value zero? */ + if (fGETBIT(0,uiV) && (class == FP_ZERO)) PdV = 0xff; + if (fGETBIT(1,uiV) && (class == FP_NORMAL)) PdV = 0xff; + if (fGETBIT(2,uiV) && (class == FP_SUBNORMAL)) PdV = 0xff; + if (fGETBIT(3,uiV) && (class == FP_INFINITE)) PdV = 0xff; + if (fGETBIT(4,uiV) && (class == FP_NAN)) PdV = 0xff; + fFPCANCELFLAGS(); +}) + + +/* Range: +/- (1.0 .. 1+(63/64)) * 2**(-6 .. +9) */ +/* More immediate bits should probably be used for more precision? */ + +Q6INSN(F2_dfimm_p,"Rdd32=dfmake(#u10):pos",ATTRIBS(A_DFMAKE), +"Make Floating Point Value", +{ + RddV = (1023ULL - 6) << 52; + RddV += (fHIDE((size8u_t))uiV) << 46; +}) + +Q6INSN(F2_dfimm_n,"Rdd32=dfmake(#u10):neg",ATTRIBS(A_DFMAKE), +"Make Floating Point Value", +{ + RddV = (1023ULL - 6) << 52; + RddV += (fHIDE((size8u_t))uiV) << 46; + RddV |= ((1ULL) << 63); +}) + + +#ifdef ADD_DP_OPS +Q6INSN(F2_dfdivcheat,"Rdd32=dfdivcheat(Rss32,Rtt32)",ATTRIBS(A_FAKEINSN), +"Cheating Instruction for Divide Testing", +{ + RddV = fUNDOUBLE(fDOUBLE(RssV) / fDOUBLE(RttV)); +}) + + +Q6INSN(F2_dfrecipa,"Rdd32,Pe4=dfrecipa(Rss32,Rtt32)",ATTRIBS(A_NOTE_COMPAT_ACCURACY,A_FAKEINSN), +"Reciprocal Approximation for Division", +{ + fHIDE(int idx;) + fHIDE(int adjust;) + fHIDE(size8s_t mant;) + fHIDE(int exp;) + if (fDF_RECIP_COMMON(RssV,RttV,RddV,adjust)) { + PeV = adjust; + idx = (RttV >> 45) & 0x7f; + mant = (fDF_RECIP_LOOKUP(idx) << 44) | 1; + exp = fDF_BIAS() - (fDF_GETEXP(RttV) - fDF_BIAS()) - 1; + RddV = fMAKEDF(fGETBIT(63,RttV),exp,mant); + } +}) + +Q6INSN(F2_dffixupn,"Rdd32=dffixupn(Rss32,Rtt32)",ATTRIBS(A_FAKEINSN), +"Fix Up Numerator", +{ + fHIDE(int adjust;) + fDF_RECIP_COMMON(RssV,RttV,RddV,adjust); + RddV = RssV; +}) + +Q6INSN(F2_dffixupd,"Rdd32=dffixupd(Rss32,Rtt32)",ATTRIBS(A_FAKEINSN), +"Fix Up Denominator", +{ + fHIDE(int adjust;) + fDF_RECIP_COMMON(RssV,RttV,RddV,adjust); + RddV = RttV; +}) + + + + +Q6INSN(F2_dfinvsqrta,"Rdd32,Pe4=dfinvsqrta(Rss32)",ATTRIBS(A_NOTE_COMPAT_ACCURACY,A_FAKEINSN), +"Reciprocal Approximation for Division", +{ + fHIDE(int idx;) + fHIDE(int adjust;) + fHIDE(size8u_t mant;) + fHIDE(int exp;) + if (fDF_INVSQRT_COMMON(RssV,RddV,adjust)) { + PeV = adjust; + idx = (RssV >> 46) & 0x7f; + mant = (fDF_INVSQRT_LOOKUP(idx) << 44); + exp = fDF_BIAS() - ((fDF_GETEXP(RssV) - fDF_BIAS()) >> 1) - 1; + RddV = fMAKEDF(fGETBIT(63,RssV),exp,mant); + } +}) + +Q6INSN(F2_dffixupr,"Rdd32=dffixupr(Rss32)",ATTRIBS(A_FAKEINSN), +"Fix Up Radicand", +{ + fHIDE(int adjust;) + fDF_INVSQRT_COMMON(RssV,RddV,adjust); + RddV = RssV; +}) + +Q6INSN(F2_dfsqrtcheat,"Rdd32=dfsqrtcheat(Rss32)",ATTRIBS(A_FAKEINSN), +"Cheating instruction for SQRT Testing", +{ + RddV = fUNDOUBLE(sqrt(fDOUBLE(RssV))); +}) +#endif + + + + + +/* CONVERSION */ + +#define CONVERT(TAG,DEST,DESTV,SRC,SRCV,OUTCAST,OUTTYPE,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) \ + Q6INSN(F2_conv_##TAG##MODETAG,#DEST"=convert_"#TAG"("#SRC")"#MODESYN,ATTRIBS(), \ + "Floating point format conversion", \ + { MODEBEH DESTV = OUTCAST(conv_##INTYPE##_to_##OUTTYPE(INCAST(SRCV))); }) + +CONVERT(sf2df,Rdd32,RddV,Rs32,RsV,fUNDOUBLE,df,fFLOAT,sf,,,) +CONVERT(df2sf,Rd32,RdV,Rss32,RssV,fUNFLOAT,sf,fDOUBLE,df,,,) + +#define ALLINTDST(TAGSTART,SRC,SRCV,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) \ +CONVERT(TAGSTART##uw,Rd32,RdV,SRC,SRCV,fCAST4u,4u,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) \ +CONVERT(TAGSTART##w,Rd32,RdV,SRC,SRCV,fCAST4s,4s,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) \ +CONVERT(TAGSTART##ud,Rdd32,RddV,SRC,SRCV,fCAST8u,8u,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) \ +CONVERT(TAGSTART##d,Rdd32,RddV,SRC,SRCV,fCAST8s,8s,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) + +#define ALLFPDST(TAGSTART,SRC,SRCV,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) \ +CONVERT(TAGSTART##sf,Rd32,RdV,SRC,SRCV,fUNFLOAT,sf,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) \ +CONVERT(TAGSTART##df,Rdd32,RddV,SRC,SRCV,fUNDOUBLE,df,INCAST,INTYPE,MODETAG,MODESYN,MODEBEH) + +#define ALLINTSRC(GEN,MODETAG,MODESYN,MODEBEH) \ +GEN(uw##2,Rs32,RsV,fCAST4u,4u,MODETAG,MODESYN,MODEBEH) \ +GEN(w##2,Rs32,RsV,fCAST4s,4s,MODETAG,MODESYN,MODEBEH) \ +GEN(ud##2,Rss32,RssV,fCAST8u,8u,MODETAG,MODESYN,MODEBEH) \ +GEN(d##2,Rss32,RssV,fCAST8s,8s,MODETAG,MODESYN,MODEBEH) + +#define ALLFPSRC(GEN,MODETAG,MODESYN,MODEBEH) \ +GEN(sf##2,Rs32,RsV,fFLOAT,sf,MODETAG,MODESYN,MODEBEH) \ +GEN(df##2,Rss32,RssV,fDOUBLE,df,MODETAG,MODESYN,MODEBEH) + +ALLINTSRC(ALLFPDST,,,) +ALLFPSRC(ALLINTDST,,,) +ALLFPSRC(ALLINTDST,_chop,:chop,fFPSETROUND_CHOP();) + diff --git a/target/hexagon/imported/ldst.idef b/target/hexagon/imported/ldst.idef new file mode 100644 index 0000000..3153438 --- /dev/null +++ b/target/hexagon/imported/ldst.idef @@ -0,0 +1,421 @@ +/* + * Copyright (c) 2019 Qualcomm Innovation Center, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +/* + * Load and Store instruction definitions + */ + +#define FRAME_EXPLICIT 1 + +/* The set of addressing modes standard to all Load instructions */ +#define STD_LD_AMODES(TAG,OPER,DESCR,ATTRIB,SHFT,SEMANTICS,SCALE)\ +Q6INSN(L2_##TAG##_io, OPER"(Rs32+#s11:"SHFT")", ATTRIB,DESCR,{fIMMEXT(siV); fEA_RI(RsV,siV); SEMANTICS; })\ +Q6INSN(L4_##TAG##_ur, OPER"(Rt32<<#u2+#U6)", ATTRIB,DESCR,{fMUST_IMMEXT(UiV); fEA_IRs(UiV,RtV,uiV); SEMANTICS;})\ +Q6INSN(L4_##TAG##_ap, OPER"(Re32=#U6)", ATTRIB,DESCR,{fMUST_IMMEXT(UiV); fEA_IMM(UiV); SEMANTICS; ReV=UiV; })\ +Q6INSN(L2_##TAG##_pr, OPER"(Rx32++Mu2)", ATTRIB,DESCR,{fEA_REG(RxV); fPM_M(RxV,MuV); SEMANTICS;})\ +Q6INSN(L2_##TAG##_pbr, OPER"(Rx32++Mu2:brev)", ATTRIB,DESCR,{fEA_BREVR(RxV); fPM_M(RxV,MuV); SEMANTICS;})\ +Q6INSN(L2_##TAG##_pi, OPER"(Rx32++#s4:"SHFT")", ATTRIB,DESCR,{fEA_REG(RxV); fPM_I(RxV,siV); SEMANTICS;})\ +Q6INSN(L2_##TAG##_pci, OPER"(Rx32++#s4:"SHFT":circ(Mu2))",ATTRIB,DESCR,{fEA_REG(RxV); fPM_CIRI(RxV,siV,MuV); SEMANTICS;})\ +Q6INSN(L2_##TAG##_pcr, OPER"(Rx32++I:circ(Mu2))", ATTRIB,DESCR,{fEA_REG(RxV); fPM_CIRR(RxV,fREAD_IREG(MuV)<>16)|(tmpV<<48); +},1) + + +STD_LD_AMODES(loadalignb, "Ryy32=memb_fifo", "Load byte into shifted vector", +ATTRIBS(A_MEMSIZE_1B,A_LOAD),"0", +{ + fHIDE(size8u_t tmpV;) + fLOAD(1,1,u,EA,tmpV); + RyyV = (((size8u_t)RyyV)>>8)|(tmpV<<56); +},0) + + + + +/* The set of addressing modes standard to all Store instructions */ +#define STD_ST_AMODES(TAG,DEST,OPER,DESCR,ATTRIB,SHFT,SEMANTICS,SCALE)\ +Q6INSN(S2_##TAG##_io, OPER"(Rs32+#s11:"SHFT")="DEST, ATTRIB,DESCR,{fIMMEXT(siV); fEA_RI(RsV,siV); SEMANTICS; })\ +Q6INSN(S2_##TAG##_pi, OPER"(Rx32++#s4:"SHFT")="DEST, ATTRIB,DESCR,{fEA_REG(RxV); fPM_I(RxV,siV); SEMANTICS; })\ +Q6INSN(S4_##TAG##_ap, OPER"(Re32=#U6)="DEST, ATTRIB,DESCR,{fMUST_IMMEXT(UiV); fEA_IMM(UiV); SEMANTICS; ReV=UiV; })\ +Q6INSN(S2_##TAG##_pr, OPER"(Rx32++Mu2)="DEST, ATTRIB,DESCR,{fEA_REG(RxV); fPM_M(RxV,MuV); SEMANTICS; })\ +Q6INSN(S4_##TAG##_ur, OPER"(Ru32<<#u2+#U6)="DEST, ATTRIB,DESCR,{fMUST_IMMEXT(UiV); fEA_IRs(UiV,RuV,uiV); SEMANTICS;})\ +Q6INSN(S2_##TAG##_pbr, OPER"(Rx32++Mu2:brev)="DEST, ATTRIB,DESCR,{fEA_BREVR(RxV); fPM_M(RxV,MuV); SEMANTICS; })\ +Q6INSN(S2_##TAG##_pci, OPER"(Rx32++#s4:"SHFT":circ(Mu2))="DEST, ATTRIB,DESCR,{fEA_REG(RxV); fPM_CIRI(RxV,siV,MuV); SEMANTICS;})\ +Q6INSN(S2_##TAG##_pcr, OPER"(Rx32++I:circ(Mu2))="DEST, ATTRIB,DESCR,{fEA_REG(RxV); fPM_CIRR(RxV,fREAD_IREG(MuV)<. + */ + +/* + * Multiply Instructions + */ + + +#define STD_SP_MODES(TAG,OPER,ATR,DST,ACCSEM,SEM,OSEM,SATSEM,RNDSEM,ACCSYN)\ +Q6INSN(M2_##TAG##_hh_s0, OPER"(Rs.H32,Rt.H32)"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM( fGETHALF(1,RsV),fGETHALF(1,RtV)))); ACCSYN;})\ +Q6INSN(M2_##TAG##_hh_s1, OPER"(Rs.H32,Rt.H32):<<1"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETHALF(1,RsV),fGETHALF(1,RtV))))); ACCSYN;})\ +Q6INSN(M2_##TAG##_hl_s0, OPER"(Rs.H32,Rt.L32)"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM( fGETHALF(1,RsV),fGETHALF(0,RtV)))); ACCSYN;})\ +Q6INSN(M2_##TAG##_hl_s1, OPER"(Rs.H32,Rt.L32):<<1"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETHALF(1,RsV),fGETHALF(0,RtV))))); ACCSYN;})\ +Q6INSN(M2_##TAG##_lh_s0, OPER"(Rs.L32,Rt.H32)"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM( fGETHALF(0,RsV),fGETHALF(1,RtV)))); ACCSYN;})\ +Q6INSN(M2_##TAG##_lh_s1, OPER"(Rs.L32,Rt.H32):<<1"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETHALF(0,RsV),fGETHALF(1,RtV))))); ACCSYN;})\ +Q6INSN(M2_##TAG##_ll_s0, OPER"(Rs.L32,Rt.L32)"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM( fGETHALF(0,RsV),fGETHALF(0,RtV)))); ACCSYN;})\ +Q6INSN(M2_##TAG##_ll_s1, OPER"(Rs.L32,Rt.L32):<<1"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETHALF(0,RsV),fGETHALF(0,RtV))))); ACCSYN;}) + +/*****************************************************/ +/* multiply 16x16->32 signed instructions */ +/*****************************************************/ +STD_SP_MODES(mpy_acc, "Rx32+=mpy", ,RxV,RxV+ ,fMPY16SS, ,fPASS,fPASS,fACC()) +STD_SP_MODES(mpy_nac, "Rx32-=mpy", ,RxV,RxV- ,fMPY16SS, ,fPASS,fPASS,fACC()) +STD_SP_MODES(mpy_acc_sat,"Rx32+=mpy", ,RxV,RxV+ ,fMPY16SS,":sat" ,fSAT, fPASS,fACC()) +STD_SP_MODES(mpy_nac_sat,"Rx32-=mpy", ,RxV,RxV- ,fMPY16SS,":sat" ,fSAT, fPASS,fACC()) +STD_SP_MODES(mpy, "Rd32=mpy", ,RdV, ,fMPY16SS, ,fPASS,fPASS,) +STD_SP_MODES(mpy_sat, "Rd32=mpy", ,RdV, ,fMPY16SS,":sat" ,fSAT, fPASS,) +STD_SP_MODES(mpy_rnd, "Rd32=mpy", ,RdV, ,fMPY16SS,":rnd" ,fPASS,fROUND,) +STD_SP_MODES(mpy_sat_rnd,"Rd32=mpy", ,RdV, ,fMPY16SS,":rnd:sat",fSAT, fROUND,) +STD_SP_MODES(mpyd_acc, "Rxx32+=mpy",,RxxV,RxxV+ ,fMPY16SS, ,fPASS,fPASS,fACC()) +STD_SP_MODES(mpyd_nac, "Rxx32-=mpy",,RxxV,RxxV- ,fMPY16SS, ,fPASS,fPASS,fACC()) +STD_SP_MODES(mpyd, "Rdd32=mpy", ,RddV, ,fMPY16SS, ,fPASS,fPASS,) +STD_SP_MODES(mpyd_rnd, "Rdd32=mpy", ,RddV, ,fMPY16SS,":rnd" ,fPASS,fROUND,) + + +/*****************************************************/ +/* multiply 16x16->32 unsigned instructions */ +/*****************************************************/ +#define STD_USP_MODES(TAG,OPER,ATR,DST,ACCSEM,SEM,OSEM,SATSEM,RNDSEM,ACCSYN)\ +Q6INSN(M2_##TAG##_hh_s0, OPER"(Rs.H32,Rt.H32)"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM( fGETUHALF(1,RsV),fGETUHALF(1,RtV)))); ACCSYN;})\ +Q6INSN(M2_##TAG##_hh_s1, OPER"(Rs.H32,Rt.H32):<<1"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETUHALF(1,RsV),fGETUHALF(1,RtV))))); ACCSYN;})\ +Q6INSN(M2_##TAG##_hl_s0, OPER"(Rs.H32,Rt.L32)"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM( fGETUHALF(1,RsV),fGETUHALF(0,RtV)))); ACCSYN;})\ +Q6INSN(M2_##TAG##_hl_s1, OPER"(Rs.H32,Rt.L32):<<1"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETUHALF(1,RsV),fGETUHALF(0,RtV))))); ACCSYN;})\ +Q6INSN(M2_##TAG##_lh_s0, OPER"(Rs.L32,Rt.H32)"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM( fGETUHALF(0,RsV),fGETUHALF(1,RtV))));} ACCSYN;)\ +Q6INSN(M2_##TAG##_lh_s1, OPER"(Rs.L32,Rt.H32):<<1"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETUHALF(0,RsV),fGETUHALF(1,RtV))))); ACCSYN;})\ +Q6INSN(M2_##TAG##_ll_s0, OPER"(Rs.L32,Rt.L32)"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM SEM( fGETUHALF(0,RsV),fGETUHALF(0,RtV)))); ACCSYN;})\ +Q6INSN(M2_##TAG##_ll_s1, OPER"(Rs.L32,Rt.L32):<<1"OSEM, ATR,"",{DST=SATSEM(RNDSEM(ACCSEM fSCALE(1,SEM(fGETUHALF(0,RsV),fGETUHALF(0,RtV))))); ACCSYN;}) + +STD_USP_MODES(mpyu_acc, "Rx32+=mpyu", ,RxV,RxV+ ,fMPY16UU, ,fPASS,fPASS,fACC()) +STD_USP_MODES(mpyu_nac, "Rx32-=mpyu", ,RxV,RxV- ,fMPY16UU, ,fPASS,fPASS,fACC()) +STD_USP_MODES(mpyu, "Rd32=mpyu", ATTRIBS(A_INTRINSIC_RETURNS_UNSIGNED) ,RdV, ,fMPY16UU, ,fPASS,fPASS,) +STD_USP_MODES(mpyud_acc, "Rxx32+=mpyu",,RxxV,RxxV+,fMPY16UU, ,fPASS,fPASS,fACC()) +STD_USP_MODES(mpyud_nac, "Rxx32-=mpyu",,RxxV,RxxV-,fMPY16UU, ,fPASS,fPASS,fACC()) +STD_USP_MODES(mpyud, "Rdd32=mpyu", ATTRIBS(A_INTRINSIC_RETURNS_UNSIGNED) ,RddV, ,fMPY16UU, ,fPASS,fPASS,) + +/**********************************************/ +/* mpy 16x#s8->32 */ +/**********************************************/ + +Q6INSN(M2_mpysip,"Rd32=+mpyi(Rs32,#u8)",ATTRIBS(A_ARCHV2,A_ROPS_2,A_MPY), +"32-bit Multiply by unsigned immediate", +{ fIMMEXT(uiV); RdV=RsV*uiV; }) + +Q6INSN(M2_mpysin,"Rd32=-mpyi(Rs32,#u8)",ATTRIBS(A_ARCHV2,A_ROPS_2,A_MPY), +"32-bit Multiply by unsigned immediate, negate result", +{ RdV=RsV*-uiV; }) + +DEF_V2_COND_MAPPING(M2_mpysmi,"Rd32=mpyi(Rs32,#m9)","((#m9<0) && (#m9>-256))","Rd32=-mpyi(Rs32,#m9*(-1))","Rd32=+mpyi(Rs32,#m9)") + +Q6INSN(M2_macsip,"Rx32+=mpyi(Rs32,#u8)",ATTRIBS(A_ARCHV2,A_ROPS_2,A_MPY), +"32-bit Multiply-Add by unsigned immediate", +{ fIMMEXT(uiV); RxV=RxV + (RsV*uiV); fACC();}) + +Q6INSN(M2_macsin,"Rx32-=mpyi(Rs32,#u8)",ATTRIBS(A_ARCHV2,A_ROPS_2,A_MPY), +"32-bit Multiply-Subtract by unsigned immediate", +{ fIMMEXT(uiV); RxV=RxV - (RsV*uiV); fACC();}) + + +/**********************************************/ +/* multiply/mac 32x32->64 instructions */ +/**********************************************/ +Q6INSN(M2_dpmpyss_s0, "Rdd32=mpy(Rs32,Rt32)", ATTRIBS(A_IT_MPY,A_IT_MPY_32),"Multiply 32x32",{RddV=fMPY32SS(RsV,RtV);}) +Q6INSN(M2_dpmpyss_acc_s0,"Rxx32+=mpy(Rs32,Rt32)",ATTRIBS(A_IT_MPY,A_IT_MPY_32,A_ROPS_2),"Multiply 32x32",{RxxV= RxxV + fMPY32SS(RsV,RtV); fACC();}) +Q6INSN(M2_dpmpyss_nac_s0,"Rxx32-=mpy(Rs32,Rt32)",ATTRIBS(A_IT_MPY,A_IT_MPY_32,A_ROPS_2),"Multiply 32x32",{RxxV= RxxV - fMPY32SS(RsV,RtV); fACC();}) + +Q6INSN(M2_dpmpyuu_s0, "Rdd32=mpyu(Rs32,Rt32)", ATTRIBS(A_IT_MPY,A_IT_MPY_32,A_INTRINSIC_RETURNS_UNSIGNED),"Multiply 32x32",{RddV=fMPY32UU(fCAST4u(RsV),fCAST4u(RtV));}) +Q6INSN(M2_dpmpyuu_acc_s0,"Rxx32+=mpyu(Rs32,Rt32)",ATTRIBS(A_IT_MPY,A_IT_MPY_32,A_ROPS_2),"Multiply 32x32",{RxxV= RxxV + fMPY32UU(fCAST4u(RsV),fCAST4u(RtV)); fACC();}) +Q6INSN(M2_dpmpyuu_nac_s0,"Rxx32-=mpyu(Rs32,Rt32)",ATTRIBS(A_IT_MPY,A_IT_MPY_32,A_ROPS_2),"Multiply 32x32",{RxxV= RxxV - fMPY32UU(fCAST4u(RsV),fCAST4u(RtV)); fACC();}) + + +/******************************************************/ +/* multiply/mac 32x32->32 (upper) instructions */ +/******************************************************/ +Q6INSN(M2_mpy_up, "Rd32=mpy(Rs32,Rt32)", ATTRIBS(A_IT_MPY,A_IT_MPY_32),"Multiply 32x32",{RdV=fMPY32SS(RsV,RtV)>>32;}) +Q6INSN(M2_mpy_up_s1, "Rd32=mpy(Rs32,Rt32):<<1", ATTRIBS(A_IT_MPY,A_IT_MPY_32),"Multiply 32x32",{RdV=fMPY32SS(RsV,RtV)>>31;}) +Q6INSN(M2_mpy_up_s1_sat, "Rd32=mpy(Rs32,Rt32):<<1:sat", ATTRIBS(A_IT_MPY,A_IT_MPY_32),"Multiply 32x32",{RdV=fSAT(fMPY32SS(RsV,RtV)>>31);}) +Q6INSN(M2_mpyu_up, "Rd32=mpyu(Rs32,Rt32)", ATTRIBS(A_IT_MPY,A_IT_MPY_32,A_INTRINSIC_RETURNS_UNSIGNED),"Multiply 32x32",{RdV=fMPY32UU(fCAST4u(RsV),fCAST4u(RtV))>>32;}) +Q6INSN(M2_mpysu_up, "Rd32=mpysu(Rs32,Rt32)", ATTRIBS(A_IT_MPY,A_IT_MPY_32),"Multiply 32x32",{RdV=fMPY32SU(RsV,fCAST4u(RtV))>>32;}) +Q6INSN(M2_dpmpyss_rnd_s0,"Rd32=mpy(Rs32,Rt32):rnd", ATTRIBS(A_IT_MPY,A_IT_MPY_32),"Multiply 32x32",{RdV=(fMPY32SS(RsV,RtV)+fCONSTLL(0x80000000))>>32;}) + +Q6INSN(M4_mac_up_s1_sat, "Rx32+=mpy(Rs32,Rt32):<<1:sat", ATTRIBS(A_IT_MPY,A_IT_MPY_32),"Multiply 32x32",{RxV=fSAT( (fSE32_64(RxV)) + (fMPY32SS(RsV,RtV)>>31)); fACC();}) +Q6INSN(M4_nac_up_s1_sat, "Rx32-=mpy(Rs32,Rt32):<<1:sat", ATTRIBS(A_IT_MPY,A_IT_MPY_32),"Multiply 32x32",{RxV=fSAT( (fSE32_64(RxV)) - (fMPY32SS(RsV,RtV)>>31)); fACC();}) + + +/**********************************************/ +/* 32x32->32 multiply (lower) */ +/**********************************************/ + +Q6INSN(M2_mpyi,"Rd32=mpyi(Rs32,Rt32)",ATTRIBS(A_IT_MPY,A_IT_MPY_32,A_MPY), +"Multiply Integer", +{ RdV=RsV*RtV;}) + + + +DEF_MAPPING(M2_mpyui,"Rd32=mpyui(Rs32,Rt32)","Rd32=mpyi(Rs32,Rt32)") + + +Q6INSN(M2_maci,"Rx32+=mpyi(Rs32,Rt32)",ATTRIBS(A_IT_MPY,A_IT_MPY_32,A_ARCHV2,A_ROPS_2,A_MPY), +"Multiply-Accumulate Integer", +{ RxV=RxV + RsV*RtV; fACC();}) + +Q6INSN(M2_mnaci,"Rx32-=mpyi(Rs32,Rt32)",ATTRIBS(A_IT_MPY,A_IT_MPY_32,A_ARCHV2,A_ROPS_2,A_MPY), +"Multiply-Neg-Accumulate Integer", +{ RxV=RxV - RsV*RtV; fACC();}) + +/****** WHY ARE THESE IN MPY.IDEF? **********/ + +Q6INSN(M2_acci,"Rx32+=add(Rs32,Rt32)",ATTRIBS(A_ROPS_2,A_ARCHV2), +"Add with accumulate", +{ RxV=RxV + RsV + RtV;}) + +Q6INSN(M2_accii,"Rx32+=add(Rs32,#s8)",ATTRIBS(A_ROPS_2,A_ARCHV2), +"Add with accumulate", +{ fIMMEXT(siV); RxV=RxV + RsV + siV;}) + +Q6INSN(M2_nacci,"Rx32-=add(Rs32,Rt32)",ATTRIBS(A_ROPS_2,A_ARCHV2), +"Add with neg accumulate", +{ RxV=RxV - (RsV + RtV);}) + +Q6INSN(M2_naccii,"Rx32-=add(Rs32,#s8)",ATTRIBS(A_ROPS_2,A_ARCHV2), +"Add with neg accumulate", +{ fIMMEXT(siV); RxV=RxV - (RsV + siV);}) + +Q6INSN(M2_subacc,"Rx32+=sub(Rt32,Rs32)",ATTRIBS(A_ROPS_2,A_ARCHV2), +"Sub with accumulate", +{ RxV=RxV + RtV - RsV;}) + + + + +Q6INSN(M4_mpyrr_addr,"Ry32=add(Ru32,mpyi(Ry32,Rs32))",ATTRIBS(A_ROPS_2,A_MPY), +"Mpy by immed and add immed", +{ RyV = RuV + RsV*RyV; fACC();}) + +Q6INSN(M4_mpyri_addr_u2,"Rd32=add(Ru32,mpyi(#u6:2,Rs32))",ATTRIBS(A_ROPS_2,A_MPY), +"Mpy by immed and add immed", +{ RdV = RuV + RsV*uiV; fACC();}) + +Q6INSN(M4_mpyri_addr,"Rd32=add(Ru32,mpyi(Rs32,#u6))",ATTRIBS(A_ROPS_2,A_MPY), +"Mpy by immed and add immed", +{ fIMMEXT(uiV); RdV = RuV + RsV*uiV; fACC();}) + + + +Q6INSN(M4_mpyri_addi,"Rd32=add(#u6,mpyi(Rs32,#U6))",ATTRIBS(A_ROPS_2,A_MPY), +"Mpy by immed and add immed", +{ fIMMEXT(uiV); RdV = uiV + RsV*UiV;}) + + + +Q6INSN(M4_mpyrr_addi,"Rd32=add(#u6,mpyi(Rs32,Rt32))",ATTRIBS(A_ROPS_2,A_MPY), +"Mpy by immed and add immed", +{ fIMMEXT(uiV); RdV = uiV + RsV*RtV;}) + + + + + + + + + + + + + + + + + +/**********************************************/ +/* vector mac 2x[16x16 -> 32] */ +/**********************************************/ + +#undef vmac_sema +#define vmac_sema(N)\ +{ fSETWORD(0,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV)))));\ + fSETWORD(1,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV)))));\ +} +Q6INSN(M2_vmpy2s_s0,"Rdd32=vmpyh(Rs32,Rt32):sat",ATTRIBS(),"Vector Multiply",vmac_sema(0)) +Q6INSN(M2_vmpy2s_s1,"Rdd32=vmpyh(Rs32,Rt32):<<1:sat",ATTRIBS(),"Vector Multiply",vmac_sema(1)) + + +#undef vmac_sema +#define vmac_sema(N)\ +{ fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV)))));\ + fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV)))));\ + fACC();\ +} +Q6INSN(M2_vmac2s_s0,"Rxx32+=vmpyh(Rs32,Rt32):sat",ATTRIBS(),"Vector Multiply",vmac_sema(0)) +Q6INSN(M2_vmac2s_s1,"Rxx32+=vmpyh(Rs32,Rt32):<<1:sat",ATTRIBS(),"Vector Multiply",vmac_sema(1)) + +#undef vmac_sema +#define vmac_sema(N)\ +{ fSETWORD(0,RddV,fSAT(fSCALE(N,fMPY16SU(fGETHALF(0,RsV),fGETUHALF(0,RtV)))));\ + fSETWORD(1,RddV,fSAT(fSCALE(N,fMPY16SU(fGETHALF(1,RsV),fGETUHALF(1,RtV)))));\ +} +Q6INSN(M2_vmpy2su_s0,"Rdd32=vmpyhsu(Rs32,Rt32):sat",ATTRIBS(),"Vector Multiply",vmac_sema(0)) +Q6INSN(M2_vmpy2su_s1,"Rdd32=vmpyhsu(Rs32,Rt32):<<1:sat",ATTRIBS(),"Vector Multiply",vmac_sema(1)) + + +#undef vmac_sema +#define vmac_sema(N)\ +{ fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + fSCALE(N,fMPY16SU(fGETHALF(0,RsV),fGETUHALF(0,RtV)))));\ + fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + fSCALE(N,fMPY16SU(fGETHALF(1,RsV),fGETUHALF(1,RtV)))));\ + fACC();\ +} +Q6INSN(M2_vmac2su_s0,"Rxx32+=vmpyhsu(Rs32,Rt32):sat",ATTRIBS(),"Vector Multiply",vmac_sema(0)) +Q6INSN(M2_vmac2su_s1,"Rxx32+=vmpyhsu(Rs32,Rt32):<<1:sat",ATTRIBS(),"Vector Multiply",vmac_sema(1)) + + + +#undef vmac_sema +#define vmac_sema(N)\ +{ fSETHALF(1,RdV,fGETHALF(1,(fSAT(fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV))) + 0x8000))));\ + fSETHALF(0,RdV,fGETHALF(1,(fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV))) + 0x8000))));\ +} +Q6INSN(M2_vmpy2s_s0pack,"Rd32=vmpyh(Rs32,Rt32):rnd:sat",ATTRIBS(A_ARCHV2),"Vector Multiply",vmac_sema(0)) +Q6INSN(M2_vmpy2s_s1pack,"Rd32=vmpyh(Rs32,Rt32):<<1:rnd:sat",ATTRIBS(A_ARCHV2),"Vector Multiply",vmac_sema(1)) + + +#undef vmac_sema +#define vmac_sema(N)\ +{ fSETWORD(0,RxxV,fGETWORD(0,RxxV) + fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV)));\ + fSETWORD(1,RxxV,fGETWORD(1,RxxV) + fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV)));\ + fACC();\ +} +Q6INSN(M2_vmac2,"Rxx32+=vmpyh(Rs32,Rt32)",ATTRIBS(A_ARCHV2),"Vector Multiply",vmac_sema(0)) + +#undef vmac_sema +#define vmac_sema(N)\ +{ fSETWORD(0,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)))));\ + fSETWORD(1,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)))));\ +} +Q6INSN(M2_vmpy2es_s0,"Rdd32=vmpyeh(Rss32,Rtt32):sat",ATTRIBS(),"Vector Multiply",vmac_sema(0)) +Q6INSN(M2_vmpy2es_s1,"Rdd32=vmpyeh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Vector Multiply",vmac_sema(1)) + +#undef vmac_sema +#define vmac_sema(N)\ +{ fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)))));\ + fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)))));\ + fACC();\ +} +Q6INSN(M2_vmac2es_s0,"Rxx32+=vmpyeh(Rss32,Rtt32):sat",ATTRIBS(),"Vector Multiply",vmac_sema(0)) +Q6INSN(M2_vmac2es_s1,"Rxx32+=vmpyeh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Vector Multiply",vmac_sema(1)) + +#undef vmac_sema +#define vmac_sema(N)\ +{ fSETWORD(0,RxxV,fGETWORD(0,RxxV) + fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)));\ + fSETWORD(1,RxxV,fGETWORD(1,RxxV) + fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)));\ + fACC();\ +} +Q6INSN(M2_vmac2es,"Rxx32+=vmpyeh(Rss32,Rtt32)",ATTRIBS(A_ARCHV2),"Vector Multiply",vmac_sema(0)) + + + + +/********************************************************/ +/* vrmpyh, aka Big Mac, aka Mac Daddy, aka Mac-ac-ac-ac */ +/* vector mac 4x[16x16] + 64 ->64 */ +/********************************************************/ + + +#undef vmac_sema +#define vmac_sema(N)\ +{ RxxV = RxxV + fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV))\ + + fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV))\ + + fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV))\ + + fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV));\ + fACC();\ +} +Q6INSN(M2_vrmac_s0,"Rxx32+=vrmpyh(Rss32,Rtt32)",ATTRIBS(),"Vector Multiply",vmac_sema(0)) + +#undef vmac_sema +#define vmac_sema(N)\ +{ RddV = fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV))\ + + fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV))\ + + fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV))\ + + fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV));\ +} +Q6INSN(M2_vrmpy_s0,"Rdd32=vrmpyh(Rss32,Rtt32)",ATTRIBS(),"Vector Multiply",vmac_sema(0)) + + + +/******************************************************/ +/* vector dual macs. just like complex */ +/******************************************************/ + + +/* With round&pack */ +#undef dmpy_sema +#define dmpy_sema(N)\ +{ fSETHALF(0,RdV,fGETHALF(1,(fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV))) + \ + fSCALE(N,fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV))) + 0x8000))));\ + fSETHALF(1,RdV,fGETHALF(1,(fSAT(fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV))) + \ + fSCALE(N,fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV))) + 0x8000))));\ +} +Q6INSN(M2_vdmpyrs_s0,"Rd32=vdmpy(Rss32,Rtt32):rnd:sat",ATTRIBS(), "vector dual mac w/ round&pack",dmpy_sema(0)) +Q6INSN(M2_vdmpyrs_s1,"Rd32=vdmpy(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"vector dual mac w/ round&pack",dmpy_sema(1)) + + + + + +/******************************************************/ +/* vector byte multiplies */ +/******************************************************/ + + +Q6INSN(M5_vrmpybuu,"Rdd32=vrmpybu(Rss32,Rtt32)",ATTRIBS(), + "vector dual mpy bytes", +{ + fSETWORD(0,RddV,(fMPY16SS(fGETUBYTE(0,RssV),fGETUBYTE(0,RttV)) + + fMPY16SS(fGETUBYTE(1,RssV),fGETUBYTE(1,RttV)) + + fMPY16SS(fGETUBYTE(2,RssV),fGETUBYTE(2,RttV)) + + fMPY16SS(fGETUBYTE(3,RssV),fGETUBYTE(3,RttV)))); + fSETWORD(1,RddV,(fMPY16SS(fGETUBYTE(4,RssV),fGETUBYTE(4,RttV)) + + fMPY16SS(fGETUBYTE(5,RssV),fGETUBYTE(5,RttV)) + + fMPY16SS(fGETUBYTE(6,RssV),fGETUBYTE(6,RttV)) + + fMPY16SS(fGETUBYTE(7,RssV),fGETUBYTE(7,RttV)))); + }) + +Q6INSN(M5_vrmacbuu,"Rxx32+=vrmpybu(Rss32,Rtt32)",ATTRIBS(), + "vector dual mac bytes", +{ + fSETWORD(0,RxxV,(fGETWORD(0,RxxV) + + fMPY16SS(fGETUBYTE(0,RssV),fGETUBYTE(0,RttV)) + + fMPY16SS(fGETUBYTE(1,RssV),fGETUBYTE(1,RttV)) + + fMPY16SS(fGETUBYTE(2,RssV),fGETUBYTE(2,RttV)) + + fMPY16SS(fGETUBYTE(3,RssV),fGETUBYTE(3,RttV)))); + fSETWORD(1,RxxV,(fGETWORD(1,RxxV) + + fMPY16SS(fGETUBYTE(4,RssV),fGETUBYTE(4,RttV)) + + fMPY16SS(fGETUBYTE(5,RssV),fGETUBYTE(5,RttV)) + + fMPY16SS(fGETUBYTE(6,RssV),fGETUBYTE(6,RttV)) + + fMPY16SS(fGETUBYTE(7,RssV),fGETUBYTE(7,RttV)))); + fACC();\ + }) + + +Q6INSN(M5_vrmpybsu,"Rdd32=vrmpybsu(Rss32,Rtt32)",ATTRIBS(), + "vector dual mpy bytes", +{ + fSETWORD(0,RddV,(fMPY16SS(fGETBYTE(0,RssV),fGETUBYTE(0,RttV)) + + fMPY16SS(fGETBYTE(1,RssV),fGETUBYTE(1,RttV)) + + fMPY16SS(fGETBYTE(2,RssV),fGETUBYTE(2,RttV)) + + fMPY16SS(fGETBYTE(3,RssV),fGETUBYTE(3,RttV)))); + fSETWORD(1,RddV,(fMPY16SS(fGETBYTE(4,RssV),fGETUBYTE(4,RttV)) + + fMPY16SS(fGETBYTE(5,RssV),fGETUBYTE(5,RttV)) + + fMPY16SS(fGETBYTE(6,RssV),fGETUBYTE(6,RttV)) + + fMPY16SS(fGETBYTE(7,RssV),fGETUBYTE(7,RttV)))); + }) + +Q6INSN(M5_vrmacbsu,"Rxx32+=vrmpybsu(Rss32,Rtt32)",ATTRIBS(), + "vector dual mac bytes", +{ + fSETWORD(0,RxxV,(fGETWORD(0,RxxV) + + fMPY16SS(fGETBYTE(0,RssV),fGETUBYTE(0,RttV)) + + fMPY16SS(fGETBYTE(1,RssV),fGETUBYTE(1,RttV)) + + fMPY16SS(fGETBYTE(2,RssV),fGETUBYTE(2,RttV)) + + fMPY16SS(fGETBYTE(3,RssV),fGETUBYTE(3,RttV)))); + fSETWORD(1,RxxV,(fGETWORD(1,RxxV) + + fMPY16SS(fGETBYTE(4,RssV),fGETUBYTE(4,RttV)) + + fMPY16SS(fGETBYTE(5,RssV),fGETUBYTE(5,RttV)) + + fMPY16SS(fGETBYTE(6,RssV),fGETUBYTE(6,RttV)) + + fMPY16SS(fGETBYTE(7,RssV),fGETUBYTE(7,RttV)))); + fACC();\ + }) + + +Q6INSN(M5_vmpybuu,"Rdd32=vmpybu(Rs32,Rt32)",ATTRIBS(), + "vector mpy bytes", +{ + fSETHALF(0,RddV,(fMPY16SS(fGETUBYTE(0,RsV),fGETUBYTE(0,RtV)))); + fSETHALF(1,RddV,(fMPY16SS(fGETUBYTE(1,RsV),fGETUBYTE(1,RtV)))); + fSETHALF(2,RddV,(fMPY16SS(fGETUBYTE(2,RsV),fGETUBYTE(2,RtV)))); + fSETHALF(3,RddV,(fMPY16SS(fGETUBYTE(3,RsV),fGETUBYTE(3,RtV)))); + }) + +Q6INSN(M5_vmpybsu,"Rdd32=vmpybsu(Rs32,Rt32)",ATTRIBS(), + "vector mpy bytes", +{ + fSETHALF(0,RddV,(fMPY16SS(fGETBYTE(0,RsV),fGETUBYTE(0,RtV)))); + fSETHALF(1,RddV,(fMPY16SS(fGETBYTE(1,RsV),fGETUBYTE(1,RtV)))); + fSETHALF(2,RddV,(fMPY16SS(fGETBYTE(2,RsV),fGETUBYTE(2,RtV)))); + fSETHALF(3,RddV,(fMPY16SS(fGETBYTE(3,RsV),fGETUBYTE(3,RtV)))); + }) + + +Q6INSN(M5_vmacbuu,"Rxx32+=vmpybu(Rs32,Rt32)",ATTRIBS(), + "vector mac bytes", +{ + fSETHALF(0,RxxV,(fGETHALF(0,RxxV)+fMPY16SS(fGETUBYTE(0,RsV),fGETUBYTE(0,RtV)))); + fSETHALF(1,RxxV,(fGETHALF(1,RxxV)+fMPY16SS(fGETUBYTE(1,RsV),fGETUBYTE(1,RtV)))); + fSETHALF(2,RxxV,(fGETHALF(2,RxxV)+fMPY16SS(fGETUBYTE(2,RsV),fGETUBYTE(2,RtV)))); + fSETHALF(3,RxxV,(fGETHALF(3,RxxV)+fMPY16SS(fGETUBYTE(3,RsV),fGETUBYTE(3,RtV)))); + fACC();\ + }) + +Q6INSN(M5_vmacbsu,"Rxx32+=vmpybsu(Rs32,Rt32)",ATTRIBS(), + "vector mac bytes", +{ + fSETHALF(0,RxxV,(fGETHALF(0,RxxV)+fMPY16SS(fGETBYTE(0,RsV),fGETUBYTE(0,RtV)))); + fSETHALF(1,RxxV,(fGETHALF(1,RxxV)+fMPY16SS(fGETBYTE(1,RsV),fGETUBYTE(1,RtV)))); + fSETHALF(2,RxxV,(fGETHALF(2,RxxV)+fMPY16SS(fGETBYTE(2,RsV),fGETUBYTE(2,RtV)))); + fSETHALF(3,RxxV,(fGETHALF(3,RxxV)+fMPY16SS(fGETBYTE(3,RsV),fGETUBYTE(3,RtV)))); + fACC();\ + }) + + + +Q6INSN(M5_vdmpybsu,"Rdd32=vdmpybsu(Rss32,Rtt32):sat",ATTRIBS(), + "vector quad mpy bytes", +{ + fSETHALF(0,RddV,fSATN(16,(fMPY16SS(fGETBYTE(0,RssV),fGETUBYTE(0,RttV)) + + fMPY16SS(fGETBYTE(1,RssV),fGETUBYTE(1,RttV))))); + fSETHALF(1,RddV,fSATN(16,(fMPY16SS(fGETBYTE(2,RssV),fGETUBYTE(2,RttV)) + + fMPY16SS(fGETBYTE(3,RssV),fGETUBYTE(3,RttV))))); + fSETHALF(2,RddV,fSATN(16,(fMPY16SS(fGETBYTE(4,RssV),fGETUBYTE(4,RttV)) + + fMPY16SS(fGETBYTE(5,RssV),fGETUBYTE(5,RttV))))); + fSETHALF(3,RddV,fSATN(16,(fMPY16SS(fGETBYTE(6,RssV),fGETUBYTE(6,RttV)) + + fMPY16SS(fGETBYTE(7,RssV),fGETUBYTE(7,RttV))))); + }) + + +Q6INSN(M5_vdmacbsu,"Rxx32+=vdmpybsu(Rss32,Rtt32):sat",ATTRIBS(), + "vector quad mac bytes", +{ + fSETHALF(0,RxxV,fSATN(16,(fGETHALF(0,RxxV) + + fMPY16SS(fGETBYTE(0,RssV),fGETUBYTE(0,RttV)) + + fMPY16SS(fGETBYTE(1,RssV),fGETUBYTE(1,RttV))))); + fSETHALF(1,RxxV,fSATN(16,(fGETHALF(1,RxxV) + + fMPY16SS(fGETBYTE(2,RssV),fGETUBYTE(2,RttV)) + + fMPY16SS(fGETBYTE(3,RssV),fGETUBYTE(3,RttV))))); + fSETHALF(2,RxxV,fSATN(16,(fGETHALF(2,RxxV) + + fMPY16SS(fGETBYTE(4,RssV),fGETUBYTE(4,RttV)) + + fMPY16SS(fGETBYTE(5,RssV),fGETUBYTE(5,RttV))))); + fSETHALF(3,RxxV,fSATN(16,(fGETHALF(3,RxxV) + + fMPY16SS(fGETBYTE(6,RssV),fGETUBYTE(6,RttV)) + + fMPY16SS(fGETBYTE(7,RssV),fGETUBYTE(7,RttV))))); + fACC();\ + }) + + + +/* Full version */ +#undef dmpy_sema +#define dmpy_sema(N)\ +{ fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV))) + \ + fSCALE(N,fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV)))));\ + fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV))) + \ + fSCALE(N,fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV)))));\ + fACC();\ +} +Q6INSN(M2_vdmacs_s0,"Rxx32+=vdmpy(Rss32,Rtt32):sat",ATTRIBS(), "",dmpy_sema(0)) +Q6INSN(M2_vdmacs_s1,"Rxx32+=vdmpy(Rss32,Rtt32):<<1:sat",ATTRIBS(),"",dmpy_sema(1)) + +#undef dmpy_sema +#define dmpy_sema(N)\ +{ fSETWORD(0,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV))) + \ + fSCALE(N,fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV)))));\ + fSETWORD(1,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV))) + \ + fSCALE(N,fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV)))));\ +} + +Q6INSN(M2_vdmpys_s0,"Rdd32=vdmpy(Rss32,Rtt32):sat",ATTRIBS(), "",dmpy_sema(0)) +Q6INSN(M2_vdmpys_s1,"Rdd32=vdmpy(Rss32,Rtt32):<<1:sat",ATTRIBS(),"",dmpy_sema(1)) + + + +/******************************************************/ +/* complex multiply/mac with */ +/* real&imag are packed together and always saturated */ +/* to protect against overflow. */ +/******************************************************/ + +#undef cmpy_sema +#define cmpy_sema(N,CONJMINUS,CONJPLUS)\ +{ fSETHALF(1,RdV,fGETHALF(1,(fSAT(fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(0,RtV))) CONJMINUS \ + fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(1,RtV))) + 0x8000))));\ + fSETHALF(0,RdV,fGETHALF(1,(fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV))) CONJPLUS \ + fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV))) + 0x8000))));\ +} +Q6INSN(M2_cmpyrs_s0,"Rd32=cmpy(Rs32,Rt32):rnd:sat",ATTRIBS(), "Complex Multiply",cmpy_sema(0,+,-)) +Q6INSN(M2_cmpyrs_s1,"Rd32=cmpy(Rs32,Rt32):<<1:rnd:sat",ATTRIBS(),"Complex Multiply",cmpy_sema(1,+,-)) + + +Q6INSN(M2_cmpyrsc_s0,"Rd32=cmpy(Rs32,Rt32*):rnd:sat",ATTRIBS(A_ARCHV2), "Complex Multiply",cmpy_sema(0,-,+)) +Q6INSN(M2_cmpyrsc_s1,"Rd32=cmpy(Rs32,Rt32*):<<1:rnd:sat",ATTRIBS(A_ARCHV2),"Complex Multiply",cmpy_sema(1,-,+)) + + +#undef cmpy_sema +#define cmpy_sema(N,CONJMINUS,CONJPLUS)\ +{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(0,RtV))) CONJMINUS \ + fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(1,RtV)))));\ + fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV))) CONJPLUS \ + fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV)))));\ + fACC();\ +} +Q6INSN(M2_cmacs_s0,"Rxx32+=cmpy(Rs32,Rt32):sat",ATTRIBS(), "Complex Multiply",cmpy_sema(0,+,-)) +Q6INSN(M2_cmacs_s1,"Rxx32+=cmpy(Rs32,Rt32):<<1:sat",ATTRIBS(),"Complex Multiply",cmpy_sema(1,+,-)) + +/* EJP: Need mac versions w/ CONJ T? */ +Q6INSN(M2_cmacsc_s0,"Rxx32+=cmpy(Rs32,Rt32*):sat",ATTRIBS(A_ARCHV2), "Complex Multiply",cmpy_sema(0,-,+)) +Q6INSN(M2_cmacsc_s1,"Rxx32+=cmpy(Rs32,Rt32*):<<1:sat",ATTRIBS(A_ARCHV2),"Complex Multiply",cmpy_sema(1,-,+)) + + +#undef cmpy_sema +#define cmpy_sema(N,CONJMINUS,CONJPLUS)\ +{ fSETWORD(1,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(0,RtV))) CONJMINUS \ + fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(1,RtV)))));\ + fSETWORD(0,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV))) CONJPLUS \ + fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV)))));\ +} + +Q6INSN(M2_cmpys_s0,"Rdd32=cmpy(Rs32,Rt32):sat",ATTRIBS(), "Complex Multiply",cmpy_sema(0,+,-)) +Q6INSN(M2_cmpys_s1,"Rdd32=cmpy(Rs32,Rt32):<<1:sat",ATTRIBS(),"Complex Multiply",cmpy_sema(1,+,-)) + +Q6INSN(M2_cmpysc_s0,"Rdd32=cmpy(Rs32,Rt32*):sat",ATTRIBS(A_ARCHV2), "Complex Multiply",cmpy_sema(0,-,+)) +Q6INSN(M2_cmpysc_s1,"Rdd32=cmpy(Rs32,Rt32*):<<1:sat",ATTRIBS(A_ARCHV2),"Complex Multiply",cmpy_sema(1,-,+)) + + + +#undef cmpy_sema +#define cmpy_sema(N,CONJMINUS,CONJPLUS)\ +{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) - (fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(0,RtV))) CONJMINUS \ + fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(1,RtV))))));\ + fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) - (fSCALE(N,fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV))) CONJPLUS \ + fSCALE(N,fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV))))));\ + fACC();\ +} +Q6INSN(M2_cnacs_s0,"Rxx32-=cmpy(Rs32,Rt32):sat",ATTRIBS(A_ARCHV2), "Complex Multiply",cmpy_sema(0,+,-)) +Q6INSN(M2_cnacs_s1,"Rxx32-=cmpy(Rs32,Rt32):<<1:sat",ATTRIBS(A_ARCHV2),"Complex Multiply",cmpy_sema(1,+,-)) + +/* EJP: need CONJ versions? */ +Q6INSN(M2_cnacsc_s0,"Rxx32-=cmpy(Rs32,Rt32*):sat",ATTRIBS(A_ARCHV2), "Complex Multiply",cmpy_sema(0,-,+)) +Q6INSN(M2_cnacsc_s1,"Rxx32-=cmpy(Rs32,Rt32*):<<1:sat",ATTRIBS(A_ARCHV2),"Complex Multiply",cmpy_sema(1,-,+)) + + +/******************************************************/ +/* complex interpolation */ +/* Given a pair of complex values, scale by a,b, sum */ +/* Saturate/shift1 and round/pack */ +/******************************************************/ + +#undef vrcmpys_sema +#define vrcmpys_sema(N,INWORD) \ +{ fSETWORD(1,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,INWORD))) + \ + fSCALE(N,fMPY16SS(fGETHALF(3,RssV),fGETHALF(1,INWORD)))));\ + fSETWORD(0,RddV,fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,INWORD))) + \ + fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(1,INWORD)))));\ +} + + + +Q6INSN(M2_vrcmpys_s1_h,"Rdd32=vrcmpys(Rss32,Rtt32):<<1:sat:raw:hi",ATTRIBS(A_ARCHV3), "Vector Reduce Complex Multiply by Scalar",vrcmpys_sema(1,fGETWORD(1,RttV))) +Q6INSN(M2_vrcmpys_s1_l,"Rdd32=vrcmpys(Rss32,Rtt32):<<1:sat:raw:lo",ATTRIBS(A_ARCHV3), "Vector Reduce Complex Multiply by Scalar",vrcmpys_sema(1,fGETWORD(0,RttV))) + +DEF_V3_COND_MAPPING(M2_vrcmpys_s1,"Rdd32=vrcmpys(Rss32,Rt32):<<1:sat","Rt32 & 1","Rdd32=vrcmpys(Rss32,Rtt32):<<1:sat:raw:hi","Rdd32=vrcmpys(Rss32,Rtt32):<<1:sat:raw:lo") + +#undef vrcmpys_sema +#define vrcmpys_sema(N,INWORD) \ +{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,INWORD))) + \ + fSCALE(N,fMPY16SS(fGETHALF(3,RssV),fGETHALF(1,INWORD)))));\ + fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,INWORD))) + \ + fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(1,INWORD)))));\ + fACC();\ +} + + + +Q6INSN(M2_vrcmpys_acc_s1_h,"Rxx32+=vrcmpys(Rss32,Rtt32):<<1:sat:raw:hi",ATTRIBS(A_ARCHV3), "Vector Reduce Complex Multiply by Scalar",vrcmpys_sema(1,fGETWORD(1,RttV))) +Q6INSN(M2_vrcmpys_acc_s1_l,"Rxx32+=vrcmpys(Rss32,Rtt32):<<1:sat:raw:lo",ATTRIBS(A_ARCHV3), "Vector Reduce Complex Multiply by Scalar",vrcmpys_sema(1,fGETWORD(0,RttV))) + +DEF_V3_COND_MAPPING(M2_vrcmpys_acc_s1,"Rxx32+=vrcmpys(Rss32,Rt32):<<1:sat","Rt32 & 1","Rxx32+=vrcmpys(Rss32,Rtt32):<<1:sat:raw:hi","Rxx32+=vrcmpys(Rss32,Rtt32):<<1:sat:raw:lo") + + +#undef vrcmpys_sema +#define vrcmpys_sema(N,INWORD) \ +{ fSETHALF(1,RdV,fGETHALF(1,fSAT(fSCALE(N,fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,INWORD))) + \ + fSCALE(N,fMPY16SS(fGETHALF(3,RssV),fGETHALF(1,INWORD))) + 0x8000)));\ + fSETHALF(0,RdV,fGETHALF(1,fSAT(fSCALE(N,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,INWORD))) + \ + fSCALE(N,fMPY16SS(fGETHALF(2,RssV),fGETHALF(1,INWORD))) + 0x8000)));\ +} + +Q6INSN(M2_vrcmpys_s1rp_h,"Rd32=vrcmpys(Rss32,Rtt32):<<1:rnd:sat:raw:hi",ATTRIBS(A_ARCHV3), "Vector Reduce Complex Multiply by Scalar",vrcmpys_sema(1,fGETWORD(1,RttV))) +Q6INSN(M2_vrcmpys_s1rp_l,"Rd32=vrcmpys(Rss32,Rtt32):<<1:rnd:sat:raw:lo",ATTRIBS(A_ARCHV3), "Vector Reduce Complex Multiply by Scalar",vrcmpys_sema(1,fGETWORD(0,RttV))) + +DEF_V3_COND_MAPPING(M2_vrcmpys_s1rp,"Rd32=vrcmpys(Rss32,Rt32):<<1:rnd:sat","Rt32 & 1","Rd32=vrcmpys(Rss32,Rtt32):<<1:rnd:sat:raw:hi","Rd32=vrcmpys(Rss32,Rtt32):<<1:rnd:sat:raw:lo") + +#if 1 +/**************************************************************/ +/* mixed mode 32x16 vector dual multiplies */ +/* */ +/**************************************************************/ + +/* SIGNED 32 x SIGNED 16 */ + + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(2,RttV))))>>16)) ); \ + fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(0,RttV))))>>16)) ); \ + fACC();\ +} +Q6INSN(M2_mmacls_s0,"Rxx32+=vmpyweh(Rss32,Rtt32):sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmacls_s1,"Rxx32+=vmpyweh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(3,RttV))))>>16) )); \ + fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(1,RttV))))>>16 ))); \ + fACC();\ +} +Q6INSN(M2_mmachs_s0,"Rxx32+=vmpywoh(Rss32,Rtt32):sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmachs_s1,"Rxx32+=vmpywoh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(2,RttV))))>>16)); \ + fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(0,RttV))))>>16)); \ +} +Q6INSN(M2_mmpyl_s0,"Rdd32=vmpyweh(Rss32,Rtt32):sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmpyl_s1,"Rdd32=vmpyweh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(3,RttV))))>>16)); \ + fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(1,RttV))))>>16)); \ +} +Q6INSN(M2_mmpyh_s0,"Rdd32=vmpywoh(Rss32,Rtt32):sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmpyh_s1,"Rdd32=vmpywoh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + + +/* With rounding */ + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(2,RttV)))+0x8000)>>16)) ); \ + fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(0,RttV)))+0x8000)>>16)) ); \ + fACC();\ +} +Q6INSN(M2_mmacls_rs0,"Rxx32+=vmpyweh(Rss32,Rtt32):rnd:sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmacls_rs1,"Rxx32+=vmpyweh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(3,RttV)))+0x8000)>>16) )); \ + fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(1,RttV)))+0x8000)>>16 ))); \ + fACC();\ +} +Q6INSN(M2_mmachs_rs0,"Rxx32+=vmpywoh(Rss32,Rtt32):rnd:sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmachs_rs1,"Rxx32+=vmpywoh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(2,RttV)))+0x8000)>>16)); \ + fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(0,RttV)))+0x8000)>>16)); \ +} +Q6INSN(M2_mmpyl_rs0,"Rdd32=vmpyweh(Rss32,Rtt32):rnd:sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmpyl_rs1,"Rdd32=vmpyweh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(3,RttV)))+0x8000)>>16)); \ + fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(1,RttV)))+0x8000)>>16)); \ +} +Q6INSN(M2_mmpyh_rs0,"Rdd32=vmpywoh(Rss32,Rtt32):rnd:sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmpyh_rs1,"Rdd32=vmpywoh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + + +#undef mixmpy_sema +#define mixmpy_sema(DEST,EQUALS,N,ACCSYN)\ +{ DEST EQUALS fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(2,RttV))) + fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(0,RttV))); ACCSYN;} + +Q6INSN(M4_vrmpyeh_s0,"Rdd32=vrmpyweh(Rss32,Rtt32)",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(RddV,=,0,)) +Q6INSN(M4_vrmpyeh_s1,"Rdd32=vrmpyweh(Rss32,Rtt32):<<1",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(RddV,=,1,)) +Q6INSN(M4_vrmpyeh_acc_s0,"Rxx32+=vrmpyweh(Rss32,Rtt32)",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(RxxV,+=,0,fACC())) +Q6INSN(M4_vrmpyeh_acc_s1,"Rxx32+=vrmpyweh(Rss32,Rtt32):<<1",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(RxxV,+=,1,fACC())) + +#undef mixmpy_sema +#define mixmpy_sema(DEST,EQUALS,N,ACCSYN)\ +{ DEST EQUALS fSCALE(N,fMPY3216SS(fGETWORD(1,RssV),fGETHALF(3,RttV))) + fSCALE(N,fMPY3216SS(fGETWORD(0,RssV),fGETHALF(1,RttV))); ACCSYN;} + +Q6INSN(M4_vrmpyoh_s0,"Rdd32=vrmpywoh(Rss32,Rtt32)",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(RddV,=,0,)) +Q6INSN(M4_vrmpyoh_s1,"Rdd32=vrmpywoh(Rss32,Rtt32):<<1",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(RddV,=,1,)) +Q6INSN(M4_vrmpyoh_acc_s0,"Rxx32+=vrmpywoh(Rss32,Rtt32)",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(RxxV,+=,0,fACC())) +Q6INSN(M4_vrmpyoh_acc_s1,"Rxx32+=vrmpywoh(Rss32,Rtt32):<<1",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(RxxV,+=,1,fACC())) + + + + + + +#undef mixmpy_sema +#define mixmpy_sema(N,H,RND)\ +{ RdV = fSAT((fSCALE(N,fMPY3216SS(RsV,fGETHALF(H,RtV)))RND)>>16); \ +} +Q6INSN(M2_hmmpyl_rs1,"Rd32=mpy(Rs32,Rt.L32):<<1:rnd:sat",ATTRIBS(A_ARCHV2),"Mixed Precision Multiply",mixmpy_sema(1,0,+0x8000)) +Q6INSN(M2_hmmpyh_rs1,"Rd32=mpy(Rs32,Rt.H32):<<1:rnd:sat",ATTRIBS(A_ARCHV2),"Mixed Precision Multiply",mixmpy_sema(1,1,+0x8000)) +Q6INSN(M2_hmmpyl_s1,"Rd32=mpy(Rs32,Rt.L32):<<1:sat",ATTRIBS(A_ARCHV2),"Mixed Precision Multiply",mixmpy_sema(1,0,)) +Q6INSN(M2_hmmpyh_s1,"Rd32=mpy(Rs32,Rt.H32):<<1:sat",ATTRIBS(A_ARCHV2),"Mixed Precision Multiply",mixmpy_sema(1,1,)) + + + + + + + + + +/* SIGNED 32 x UNSIGNED 16 */ + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(2,RttV))))>>16)) ); \ + fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(0,RttV))))>>16)) ); \ + fACC();\ +} +Q6INSN(M2_mmaculs_s0,"Rxx32+=vmpyweuh(Rss32,Rtt32):sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmaculs_s1,"Rxx32+=vmpyweuh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(3,RttV))))>>16) )); \ + fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(1,RttV))))>>16 ))); \ + fACC();\ +} +Q6INSN(M2_mmacuhs_s0,"Rxx32+=vmpywouh(Rss32,Rtt32):sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmacuhs_s1,"Rxx32+=vmpywouh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(2,RttV))))>>16)); \ + fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(0,RttV))))>>16)); \ +} +Q6INSN(M2_mmpyul_s0,"Rdd32=vmpyweuh(Rss32,Rtt32):sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmpyul_s1,"Rdd32=vmpyweuh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(3,RttV))))>>16)); \ + fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(1,RttV))))>>16)); \ +} +Q6INSN(M2_mmpyuh_s0,"Rdd32=vmpywouh(Rss32,Rtt32):sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmpyuh_s1,"Rdd32=vmpywouh(Rss32,Rtt32):<<1:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + + +/* With rounding */ + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(2,RttV)))+0x8000)>>16)) ); \ + fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(0,RttV)))+0x8000)>>16)) ); \ + fACC();\ +} +Q6INSN(M2_mmaculs_rs0,"Rxx32+=vmpyweuh(Rss32,Rtt32):rnd:sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmaculs_rs1,"Rxx32+=vmpyweuh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RxxV,fSAT(fGETWORD(1,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(3,RttV)))+0x8000)>>16) )); \ + fSETWORD(0,RxxV,fSAT(fGETWORD(0,RxxV) + ((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(1,RttV)))+0x8000)>>16 ))); \ + fACC();\ +} +Q6INSN(M2_mmacuhs_rs0,"Rxx32+=vmpywouh(Rss32,Rtt32):rnd:sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmacuhs_rs1,"Rxx32+=vmpywouh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(2,RttV)))+0x8000)>>16)); \ + fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(0,RttV)))+0x8000)>>16)); \ +} +Q6INSN(M2_mmpyul_rs0,"Rdd32=vmpyweuh(Rss32,Rtt32):rnd:sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmpyul_rs1,"Rdd32=vmpyweuh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + +#undef mixmpy_sema +#define mixmpy_sema(N)\ +{ fSETWORD(1,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(1,RssV),fGETUHALF(3,RttV)))+0x8000)>>16)); \ + fSETWORD(0,RddV,fSAT((fSCALE(N,fMPY3216SU(fGETWORD(0,RssV),fGETUHALF(1,RttV)))+0x8000)>>16)); \ +} +Q6INSN(M2_mmpyuh_rs0,"Rdd32=vmpywouh(Rss32,Rtt32):rnd:sat",ATTRIBS(), "Mixed Precision Multiply",mixmpy_sema(0)) +Q6INSN(M2_mmpyuh_rs1,"Rdd32=vmpywouh(Rss32,Rtt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Multiply",mixmpy_sema(1)) + + +#endif + + + +/**************************************************************/ +/* complex mac with full 64-bit accum - no sat, no shift */ +/* either do real or accum, never both */ +/**************************************************************/ + +Q6INSN(M2_vrcmaci_s0,"Rxx32+=vrcmpyi(Rss32,Rtt32)",ATTRIBS(),"Vector Complex Mac Imaginary", +{ +RxxV = RxxV + fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,RttV)) + \ + fMPY16SS(fGETHALF(0,RssV),fGETHALF(1,RttV)) + \ + fMPY16SS(fGETHALF(3,RssV),fGETHALF(2,RttV)) + \ + fMPY16SS(fGETHALF(2,RssV),fGETHALF(3,RttV));\ +fACC();\ +}) + +Q6INSN(M2_vrcmacr_s0,"Rxx32+=vrcmpyr(Rss32,Rtt32)",ATTRIBS(),"Vector Complex Mac Real", +{ RxxV = RxxV + fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)) - \ + fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV)) + \ + fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)) - \ + fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV));\ +fACC();\ +}) + +Q6INSN(M2_vrcmaci_s0c,"Rxx32+=vrcmpyi(Rss32,Rtt32*)",ATTRIBS(A_ARCHV2),"Vector Complex Mac Imaginary", +{ +RxxV = RxxV + fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,RttV)) - \ + fMPY16SS(fGETHALF(0,RssV),fGETHALF(1,RttV)) + \ + fMPY16SS(fGETHALF(3,RssV),fGETHALF(2,RttV)) - \ + fMPY16SS(fGETHALF(2,RssV),fGETHALF(3,RttV));\ +fACC();\ +}) + +Q6INSN(M2_vrcmacr_s0c,"Rxx32+=vrcmpyr(Rss32,Rtt32*)",ATTRIBS(A_ARCHV2),"Vector Complex Mac Real", +{ RxxV = RxxV + fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)) + \ + fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV)) + \ + fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)) + \ + fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV));\ +fACC();\ +}) + +Q6INSN(M2_cmaci_s0,"Rxx32+=cmpyi(Rs32,Rt32)",ATTRIBS(),"Vector Complex Mac Imaginary", +{ +RxxV = RxxV + fMPY16SS(fGETHALF(1,RsV),fGETHALF(0,RtV)) + \ + fMPY16SS(fGETHALF(0,RsV),fGETHALF(1,RtV)); +fACC();\ +}) + +Q6INSN(M2_cmacr_s0,"Rxx32+=cmpyr(Rs32,Rt32)",ATTRIBS(),"Vector Complex Mac Real", +{ RxxV = RxxV + fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV)) - \ + fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV)); +fACC();\ +}) + + +Q6INSN(M2_vrcmpyi_s0,"Rdd32=vrcmpyi(Rss32,Rtt32)",ATTRIBS(),"Vector Complex Mpy Imaginary", +{ +RddV = fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,RttV)) + \ + fMPY16SS(fGETHALF(0,RssV),fGETHALF(1,RttV)) + \ + fMPY16SS(fGETHALF(3,RssV),fGETHALF(2,RttV)) + \ + fMPY16SS(fGETHALF(2,RssV),fGETHALF(3,RttV));\ +}) + +Q6INSN(M2_vrcmpyr_s0,"Rdd32=vrcmpyr(Rss32,Rtt32)",ATTRIBS(),"Vector Complex Mpy Real", +{ RddV = fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)) - \ + fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV)) + \ + fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)) - \ + fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV));\ +}) + +Q6INSN(M2_vrcmpyi_s0c,"Rdd32=vrcmpyi(Rss32,Rtt32*)",ATTRIBS(A_ARCHV2),"Vector Complex Mpy Imaginary", +{ +RddV = fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,RttV)) - \ + fMPY16SS(fGETHALF(0,RssV),fGETHALF(1,RttV)) + \ + fMPY16SS(fGETHALF(3,RssV),fGETHALF(2,RttV)) - \ + fMPY16SS(fGETHALF(2,RssV),fGETHALF(3,RttV));\ +}) + +Q6INSN(M2_vrcmpyr_s0c,"Rdd32=vrcmpyr(Rss32,Rtt32*)",ATTRIBS(A_ARCHV2),"Vector Complex Mpy Real", +{ RddV = fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)) + \ + fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV)) + \ + fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)) + \ + fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV));\ +}) + +Q6INSN(M2_cmpyi_s0,"Rdd32=cmpyi(Rs32,Rt32)",ATTRIBS(),"Vector Complex Mpy Imaginary", +{ +RddV = fMPY16SS(fGETHALF(1,RsV),fGETHALF(0,RtV)) + \ + fMPY16SS(fGETHALF(0,RsV),fGETHALF(1,RtV)); +}) + +Q6INSN(M2_cmpyr_s0,"Rdd32=cmpyr(Rs32,Rt32)",ATTRIBS(),"Vector Complex Mpy Real", +{ RddV = fMPY16SS(fGETHALF(0,RsV),fGETHALF(0,RtV)) - \ + fMPY16SS(fGETHALF(1,RsV),fGETHALF(1,RtV)); +}) + + +/**************************************************************/ +/* Complex mpy/mac with 2x32 bit accum, sat, shift */ +/* 32x16 real or imag */ +/**************************************************************/ + +#if 1 + +Q6INSN(M4_cmpyi_wh,"Rd32=cmpyiwh(Rss32,Rt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Complex Multiply", +{ + RdV = fSAT( ( fMPY3216SS(fGETWORD(0,RssV),fGETHALF(1,RtV)) + + fMPY3216SS(fGETWORD(1,RssV),fGETHALF(0,RtV)) + + 0x4000)>>15); +}) + + +Q6INSN(M4_cmpyr_wh,"Rd32=cmpyrwh(Rss32,Rt32):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Complex Multiply", +{ + RdV = fSAT( ( fMPY3216SS(fGETWORD(0,RssV),fGETHALF(0,RtV)) + - fMPY3216SS(fGETWORD(1,RssV),fGETHALF(1,RtV)) + + 0x4000)>>15); +}) + +Q6INSN(M4_cmpyi_whc,"Rd32=cmpyiwh(Rss32,Rt32*):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Complex Multiply", +{ + RdV = fSAT( ( fMPY3216SS(fGETWORD(1,RssV),fGETHALF(0,RtV)) + - fMPY3216SS(fGETWORD(0,RssV),fGETHALF(1,RtV)) + + 0x4000)>>15); +}) + + +Q6INSN(M4_cmpyr_whc,"Rd32=cmpyrwh(Rss32,Rt32*):<<1:rnd:sat",ATTRIBS(),"Mixed Precision Complex Multiply", +{ + RdV = fSAT( ( fMPY3216SS(fGETWORD(0,RssV),fGETHALF(0,RtV)) + + fMPY3216SS(fGETWORD(1,RssV),fGETHALF(1,RtV)) + + 0x4000)>>15); +}) + + +#endif + +/**************************************************************/ +/* Vector mpy/mac with 2x32 bit accum, sat, shift */ +/* either do real or imag, never both */ +/**************************************************************/ + +#undef VCMPYSEMI +#define VCMPYSEMI(DST,ACC0,ACC1,SHIFT,SAT,ACCSYN) \ + fSETWORD(0,DST,SAT(ACC0 fSCALE(SHIFT,fMPY16SS(fGETHALF(1,RssV),fGETHALF(0,RttV)) + \ + fMPY16SS(fGETHALF(0,RssV),fGETHALF(1,RttV))))); \ + fSETWORD(1,DST,SAT(ACC1 fSCALE(SHIFT,fMPY16SS(fGETHALF(3,RssV),fGETHALF(2,RttV)) + \ + fMPY16SS(fGETHALF(2,RssV),fGETHALF(3,RttV))))); \ + ACCSYN;\ + +#undef VCMPYSEMR +#define VCMPYSEMR(DST,ACC0,ACC1,SHIFT,SAT,ACCSYN) \ + fSETWORD(0,DST,SAT(ACC0 fSCALE(SHIFT,fMPY16SS(fGETHALF(0,RssV),fGETHALF(0,RttV)) - \ + fMPY16SS(fGETHALF(1,RssV),fGETHALF(1,RttV))))); \ + fSETWORD(1,DST,SAT(ACC1 fSCALE(SHIFT,fMPY16SS(fGETHALF(2,RssV),fGETHALF(2,RttV)) - \ + fMPY16SS(fGETHALF(3,RssV),fGETHALF(3,RttV))))); \ + ACCSYN;\ + + +#undef VCMPYIR +#define VCMPYIR(TAGBASE,DSTSYN,DSTVAL,ACCSEM,ACCVAL0,ACCVAL1,SHIFTSYN,SHIFTVAL,SATSYN,SATVAL,ACCSYN) \ +Q6INSN(M2_##TAGBASE##i,DSTSYN ACCSEM "vcmpyi(Rss32,Rtt32)" SHIFTSYN SATSYN,ATTRIBS(A_ARCHV2), \ + "Vector Complex Multiply Imaginary", { VCMPYSEMI(DSTVAL,ACCVAL0,ACCVAL1,SHIFTVAL,SATVAL,ACCSYN); }) \ +Q6INSN(M2_##TAGBASE##r,DSTSYN ACCSEM "vcmpyr(Rss32,Rtt32)" SHIFTSYN SATSYN,ATTRIBS(A_ARCHV2), \ + "Vector Complex Multiply Imaginary", { VCMPYSEMR(DSTVAL,ACCVAL0,ACCVAL1,SHIFTVAL,SATVAL,ACCSYN); }) + + +VCMPYIR(vcmpy_s0_sat_,"Rdd32",RddV,"=",,,"",0,":sat",fSAT,) +VCMPYIR(vcmpy_s1_sat_,"Rdd32",RddV,"=",,,":<<1",1,":sat",fSAT,) +VCMPYIR(vcmac_s0_sat_,"Rxx32",RxxV,"+=",fGETWORD(0,RxxV) + ,fGETWORD(1,RxxV) + ,"",0,":sat",fSAT,fACC()) + + +/********************************************************************** + * Rotation -- by 0, 90, 180, or 270 means mult by 1, J, -1, -J * + *********************************************************************/ + +Q6INSN(S2_vcrotate,"Rdd32=vcrotate(Rss32,Rt32)",ATTRIBS(A_ARCHV2),"Rotate complex value by multiple of PI/2", +{ + fHIDE(size1u_t tmp;) + tmp = fEXTRACTU_RANGE(RtV,1,0); + if (tmp == 0) { /* No rotation */ + fSETHALF(0,RddV,fGETHALF(0,RssV)); + fSETHALF(1,RddV,fGETHALF(1,RssV)); + } else if (tmp == 1) { /* Multiply by -J */ + fSETHALF(0,RddV,fGETHALF(1,RssV)); + fSETHALF(1,RddV,fSATH(-fGETHALF(0,RssV))); + } else if (tmp == 2) { /* Multiply by J */ + fSETHALF(0,RddV,fSATH(-fGETHALF(1,RssV))); + fSETHALF(1,RddV,fGETHALF(0,RssV)); + } else { /* Multiply by -1 */ + fHIDE(if (tmp != 3) fatal("C is broken");) + fSETHALF(0,RddV,fSATH(-fGETHALF(0,RssV))); + fSETHALF(1,RddV,fSATH(-fGETHALF(1,RssV))); + } + tmp = fEXTRACTU_RANGE(RtV,3,2); + if (tmp == 0) { /* No rotation */ + fSETHALF(2,RddV,fGETHALF(2,RssV)); + fSETHALF(3,RddV,fGETHALF(3,RssV)); + } else if (tmp == 1) { /* Multiply by -J */ + fSETHALF(2,RddV,fGETHALF(3,RssV)); + fSETHALF(3,RddV,fSATH(-fGETHALF(2,RssV))); + } else if (tmp == 2) { /* Multiply by J */ + fSETHALF(2,RddV,fSATH(-fGETHALF(3,RssV))); + fSETHALF(3,RddV,fGETHALF(2,RssV)); + } else { /* Multiply by -1 */ + fHIDE(if (tmp != 3) fatal("C is broken");) + fSETHALF(2,RddV,fSATH(-fGETHALF(2,RssV))); + fSETHALF(3,RddV,fSATH(-fGETHALF(3,RssV))); + } +}) + + +Q6INSN(S4_vrcrotate_acc,"Rxx32+=vrcrotate(Rss32,Rt32,#u2)",ATTRIBS(),"Rotate and Reduce Bytes", +{ + fHIDE(int i; int tmpr; int tmpi; unsigned int control;) + fHIDE(int sumr; int sumi;) + sumr = 0; + sumi = 0; + control = fGETUBYTE(uiV,RtV); + for (i = 0; i < 8; i += 2) { + tmpr = fGETBYTE(i ,RssV); + tmpi = fGETBYTE(i+1,RssV); + switch (control & 3) { + case 0: /* No Rotation */ + sumr += tmpr; + sumi += tmpi; + break; + case 1: /* Multiply by -J */ + sumr += tmpi; + sumi -= tmpr; + break; + case 2: /* Multiply by J */ + sumr -= tmpi; + sumi += tmpr; + break; + case 3: /* Multiply by -1 */ + sumr -= tmpr; + sumi -= tmpi; + break; + fHIDE(default: fatal("C is broken!");) + } + control = control >> 2; + } + fSETWORD(0,RxxV,fGETWORD(0,RxxV) + sumr); + fSETWORD(1,RxxV,fGETWORD(1,RxxV) + sumi); + fACC();\ +}) + +Q6INSN(S4_vrcrotate,"Rdd32=vrcrotate(Rss32,Rt32,#u2)",ATTRIBS(),"Rotate and Reduce Bytes", +{ + fHIDE(int i; int tmpr; int tmpi; unsigned int control;) + fHIDE(int sumr; int sumi;) + sumr = 0; + sumi = 0; + control = fGETUBYTE(uiV,RtV); + for (i = 0; i < 8; i += 2) { + tmpr = fGETBYTE(i ,RssV); + tmpi = fGETBYTE(i+1,RssV); + switch (control & 3) { + case 0: /* No Rotation */ + sumr += tmpr; + sumi += tmpi; + break; + case 1: /* Multiply by -J */ + sumr += tmpi; + sumi -= tmpr; + break; + case 2: /* Multiply by J */ + sumr -= tmpi; + sumi += tmpr; + break; + case 3: /* Multiply by -1 */ + sumr -= tmpr; + sumi -= tmpi; + break; + fHIDE(default: fatal("C is broken!");) + } + control = control >> 2; + } + fSETWORD(0,RddV,sumr); + fSETWORD(1,RddV,sumi); +}) + + +Q6INSN(S2_vcnegh,"Rdd32=vcnegh(Rss32,Rt32)",ATTRIBS(),"Conditional Negate halfwords", +{ + fHIDE(int i;) + for (i = 0; i < 4; i++) { + if (fGETBIT(i,RtV)) { + fSETHALF(i,RddV,fSATH(-fGETHALF(i,RssV))); + } else { + fSETHALF(i,RddV,fGETHALF(i,RssV)); + } + } +}) + +Q6INSN(S2_vrcnegh,"Rxx32+=vrcnegh(Rss32,Rt32)",ATTRIBS(),"Vector Reduce Conditional Negate halfwords", +{ + fHIDE(int i;) + for (i = 0; i < 4; i++) { + if (fGETBIT(i,RtV)) { + RxxV += -fGETHALF(i,RssV); + } else { + RxxV += fGETHALF(i,RssV); + } + } + fACC(); +}) + + +/********************************************************************** + * Finite-field multiplies. Written by David Hoyle * + *********************************************************************/ + +Q6INSN(M4_pmpyw,"Rdd32=pmpyw(Rs32,Rt32)",ATTRIBS(),"Polynomial 32bit Multiplication with Addition in GF(2)", +{ + fHIDE(int i; unsigned int y;) + fHIDE(unsigned long long x; unsigned long long prod;) + x = fGETUWORD(0, RsV); + y = fGETUWORD(0, RtV); + + prod = 0; + for(i=0; i < 32; i++) { + if((y >> i) & 1) prod ^= (x << i); + } + RddV = prod; +}) + +Q6INSN(M4_vpmpyh,"Rdd32=vpmpyh(Rs32,Rt32)",ATTRIBS(),"Dual Polynomial 16bit Multiplication with Addition in GF(2)", +{ + fHIDE(int i; unsigned int x0; unsigned int x1;) + fHIDE(unsigned int y0; unsigned int y1;) + fHIDE(unsigned int prod0; unsigned int prod1;) + + x0 = fGETUHALF(0, RsV); + x1 = fGETUHALF(1, RsV); + y0 = fGETUHALF(0, RtV); + y1 = fGETUHALF(1, RtV); + + prod0 = prod1 = 0; + for(i=0; i < 16; i++) { + if((y0 >> i) & 1) prod0 ^= (x0 << i); + if((y1 >> i) & 1) prod1 ^= (x1 << i); + } + fSETHALF(0,RddV,fGETUHALF(0,prod0)); + fSETHALF(1,RddV,fGETUHALF(0,prod1)); + fSETHALF(2,RddV,fGETUHALF(1,prod0)); + fSETHALF(3,RddV,fGETUHALF(1,prod1)); +}) + +Q6INSN(M4_pmpyw_acc,"Rxx32^=pmpyw(Rs32,Rt32)",ATTRIBS(),"Polynomial 32bit Multiplication with Addition in GF(2)", +{ + fHIDE(int i; unsigned int y;) + fHIDE(unsigned long long x; unsigned long long prod;) + x = fGETUWORD(0, RsV); + y = fGETUWORD(0, RtV); + + prod = 0; + for(i=0; i < 32; i++) { + if((y >> i) & 1) prod ^= (x << i); + } + RxxV ^= prod; +}) + +Q6INSN(M4_vpmpyh_acc,"Rxx32^=vpmpyh(Rs32,Rt32)",ATTRIBS(),"Dual Polynomial 16bit Multiplication with Addition in GF(2)", +{ + fHIDE(int i; unsigned int x0; unsigned int x1;) + fHIDE(unsigned int y0; unsigned int y1;) + fHIDE(unsigned int prod0; unsigned int prod1;) + + x0 = fGETUHALF(0, RsV); + x1 = fGETUHALF(1, RsV); + y0 = fGETUHALF(0, RtV); + y1 = fGETUHALF(1, RtV); + + prod0 = prod1 = 0; + for(i=0; i < 16; i++) { + if((y0 >> i) & 1) prod0 ^= (x0 << i); + if((y1 >> i) & 1) prod1 ^= (x1 << i); + } + fSETHALF(0,RxxV,fGETUHALF(0,RxxV) ^ fGETUHALF(0,prod0)); + fSETHALF(1,RxxV,fGETUHALF(1,RxxV) ^ fGETUHALF(0,prod1)); + fSETHALF(2,RxxV,fGETUHALF(2,RxxV) ^ fGETUHALF(1,prod0)); + fSETHALF(3,RxxV,fGETUHALF(3,RxxV) ^ fGETUHALF(1,prod1)); +}) + + +/* V70: TINY CORE */ + +#define CMPY64(TAG,NAME,DESC,OPERAND1,OP,W0,W1,W2,W3) \ +Q6INSN(M7_##TAG,"Rdd32=" NAME "(Rss32," OPERAND1 ")",ATTRIBS(A_RESTRICT_SLOT3ONLY,A_RESTRICT_NOSLOT2_MPY),"Complex Multiply 64-bit " DESC, { RddV = (fMPY32SS(fGETWORD(W0, RssV), fGETWORD(W1, RttV)) OP fMPY32SS(fGETWORD(W2, RssV), fGETWORD(W3, RttV))); fEXTENSION_AUDIO();})\ +Q6INSN(M7_##TAG##_acc,"Rxx32+=" NAME "(Rss32,"OPERAND1")",ATTRIBS(A_RESTRICT_SLOT3ONLY,A_RESTRICT_NOSLOT2_MPY,A_NOTE_NOSLOT2_MPY),"Complex Multiply-Accumulate 64-bit " DESC, { RxxV += (fMPY32SS(fGETWORD(W0, RssV), fGETWORD(W1, RttV)) OP fMPY32SS(fGETWORD(W2, RssV), fGETWORD(W3, RttV))); fEXTENSION_AUDIO(); fACC(); }) + +CMPY64(dcmpyrw, "cmpyrw","Real","Rtt32" ,-,0,0,1,1) +CMPY64(dcmpyrwc,"cmpyrw","Real","Rtt32*",+,0,0,1,1) +CMPY64(dcmpyiw, "cmpyiw","Imag","Rtt32" ,+,0,1,1,0) +CMPY64(dcmpyiwc,"cmpyiw","Imag","Rtt32*",-,1,0,0,1) + +DEF_MAPPING(M7_vdmpy,"Rdd32=vdmpyw(Rss32,Rtt32)","Rdd32=cmpyrw(Rss32,Rtt32*)") +DEF_MAPPING(M7_vdmpy_acc,"Rxx32+=vdmpyw(Rss32,Rtt32)","Rxx32+=cmpyrw(Rss32,Rtt32*)") + + + +#define CMPY128(TAG, NAME, OPERAND1, WORD0, WORD1, WORD2, WORD3, OP) \ +Q6INSN(M7_##TAG,"Rd32=" NAME "(Rss32,"OPERAND1"):<<1:sat",ATTRIBS(A_RESTRICT_SLOT3ONLY,A_RESTRICT_NOSLOT2_MPY,A_NOTE_NOSLOT2_MPY),"Complex Multiply 32-bit result real", \ +{ \ +fHIDE(size16s_t acc128;)\ +fHIDE(size16s_t tmp128;)\ +fHIDE(size8s_t acc64;)\ +tmp128 = fCAST8S_16S(fMPY32SS(fGETWORD(WORD0, RssV), fGETWORD(WORD1, RttV)));\ +acc128 = fCAST8S_16S(fMPY32SS(fGETWORD(WORD2, RssV), fGETWORD(WORD3, RttV)));\ +acc128 = OP(tmp128,acc128);\ +acc128 = fSHIFTR128(acc128, 31);\ +acc64 = fCAST16S_8S(acc128);\ +RdV = fSATW(acc64);\ +fEXTENSION_AUDIO();}) + + +CMPY128(wcmpyrw, "cmpyrw", "Rtt32", 0, 0, 1, 1, fSUB128) +CMPY128(wcmpyrwc, "cmpyrw", "Rtt32*", 0, 0, 1, 1, fADD128) +CMPY128(wcmpyiw, "cmpyiw", "Rtt32", 0, 1, 1, 0, fADD128) +CMPY128(wcmpyiwc, "cmpyiw", "Rtt32*", 1, 0, 0, 1, fSUB128) + + +#define CMPY128RND(TAG, NAME, OPERAND1, WORD0, WORD1, WORD2, WORD3, OP) \ +Q6INSN(M7_##TAG##_rnd,"Rd32=" NAME "(Rss32,"OPERAND1"):<<1:rnd:sat",ATTRIBS(A_RESTRICT_SLOT3ONLY,A_RESTRICT_NOSLOT2_MPY,A_NOTE_NOSLOT2_MPY),"Complex Multiply 32-bit result real", \ +{ \ +fHIDE(size16s_t acc128;)\ +fHIDE(size16s_t tmp128;)\ +fHIDE(size16s_t const128;)\ +fHIDE(size8s_t acc64;)\ +tmp128 = fCAST8S_16S(fMPY32SS(fGETWORD(WORD0, RssV), fGETWORD(WORD1, RttV)));\ +acc128 = fCAST8S_16S(fMPY32SS(fGETWORD(WORD2, RssV), fGETWORD(WORD3, RttV)));\ +const128 = fCAST8S_16S(fCONSTLL(0x40000000));\ +acc128 = OP(tmp128,acc128);\ +acc128 = fADD128(acc128,const128);\ +acc128 = fSHIFTR128(acc128, 31);\ +acc64 = fCAST16S_8S(acc128);\ +RdV = fSATW(acc64);\ +fEXTENSION_AUDIO();}) + +CMPY128RND(wcmpyrw, "cmpyrw", "Rtt32", 0, 0, 1, 1, fSUB128) +CMPY128RND(wcmpyrwc, "cmpyrw", "Rtt32*", 0, 0, 1, 1, fADD128) +CMPY128RND(wcmpyiw, "cmpyiw", "Rtt32", 0, 1, 1, 0, fADD128) +CMPY128RND(wcmpyiwc, "cmpyiw", "Rtt32*", 1, 0, 0, 1, fSUB128) + + + + diff --git a/target/hexagon/imported/shift.idef b/target/hexagon/imported/shift.idef new file mode 100644 index 0000000..4217bcf --- /dev/null +++ b/target/hexagon/imported/shift.idef @@ -0,0 +1,1211 @@ +/* + * Copyright (c) 2019 Qualcomm Innovation Center, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +/* + * S-type Instructions + */ + +/**********************************************/ +/* SHIFTS */ +/**********************************************/ + +/* NOTE: Rdd = Rs *right* shifts don't make sense */ +/* NOTE: Rd[d] = Rs[s] *right* shifts with saturation don't make sense */ + + +#define RSHIFTTYPES(TAGEND,REGD,REGS,REGSTYPE,ACC,ACCSRC,SAT,SATOPT,ATTRS) \ +Q6INSN(S2_asr_r_##TAGEND,#REGD "32" #ACC "=asr(" #REGS "32,Rt32)" #SATOPT,ATTRIBS(ATTRS), \ + "Arithmetic Shift Right by Register", \ + { \ + fHIDE(size4s_t) shamt=fSXTN(7,32,RtV);\ + REGD##V = SAT(ACCSRC ACC fBIDIR_ASHIFTR(REGS##V,shamt,REGSTYPE)); \ + })\ +\ +Q6INSN(S2_asl_r_##TAGEND,#REGD "32" #ACC "=asl(" #REGS "32,Rt32)" #SATOPT,ATTRIBS(ATTRS), \ + "Arithmetic Shift Left by Register", \ + { \ + fHIDE(size4s_t) shamt=fSXTN(7,32,RtV);\ + REGD##V = SAT(ACCSRC ACC fBIDIR_ASHIFTL(REGS##V,shamt,REGSTYPE)); \ + })\ +\ +Q6INSN(S2_lsr_r_##TAGEND,#REGD "32" #ACC "=lsr(" #REGS "32,Rt32)" #SATOPT,ATTRIBS(ATTRS), \ + "Logical Shift Right by Register", \ + { \ + fHIDE(size4s_t) shamt=fSXTN(7,32,RtV);\ + REGD##V = SAT(ACCSRC ACC fBIDIR_LSHIFTR(REGS##V,shamt,REGSTYPE)); \ + })\ +\ +Q6INSN(S2_lsl_r_##TAGEND,#REGD "32" #ACC "=lsl(" #REGS "32,Rt32)" #SATOPT,ATTRIBS(ATTRS), \ + "Logical Shift Left by Register", \ + { \ + fHIDE(size4s_t) shamt=fSXTN(7,32,RtV);\ + REGD##V = SAT(ACCSRC ACC fBIDIR_LSHIFTL(REGS##V,shamt,REGSTYPE)); \ + }) + +RSHIFTTYPES(r,Rd,Rs,4_8,,,fECHO,,) +RSHIFTTYPES(p,Rdd,Rss,8_8,,,fECHO,,) +RSHIFTTYPES(r_acc,Rx,Rs,4_8,+,RxV,fECHO,,A_ROPS_2) +RSHIFTTYPES(p_acc,Rxx,Rss,8_8,+,RxxV,fECHO,,A_ROPS_2) +RSHIFTTYPES(r_nac,Rx,Rs,4_8,-,RxV,fECHO,,A_ROPS_2) +RSHIFTTYPES(p_nac,Rxx,Rss,8_8,-,RxxV,fECHO,,A_ROPS_2) + +RSHIFTTYPES(r_and,Rx,Rs,4_8,&,RxV,fECHO,,A_ROPS_2) +RSHIFTTYPES(r_or,Rx,Rs,4_8,|,RxV,fECHO,,A_ROPS_2) +RSHIFTTYPES(p_and,Rxx,Rss,8_8,&,RxxV,fECHO,,A_ROPS_2) +RSHIFTTYPES(p_or,Rxx,Rss,8_8,|,RxxV,fECHO,,A_ROPS_2) +RSHIFTTYPES(p_xor,Rxx,Rss,8_8,^,RxxV,fECHO,,A_ROPS_2) + + +#undef RSHIFTTYPES + +/* Register shift with saturation */ +#define RSATSHIFTTYPES(TAGEND,REGD,REGS,REGSTYPE) \ +Q6INSN(S2_asr_r_##TAGEND,#REGD "32" "=asr(" #REGS "32,Rt32):sat",ATTRIBS(), \ + "Arithmetic Shift Right by Register", \ + { \ + fHIDE(size4s_t) shamt=fSXTN(7,32,RtV);\ + REGD##V = fBIDIR_ASHIFTR_SAT(REGS##V,shamt,REGSTYPE); \ + })\ +\ +Q6INSN(S2_asl_r_##TAGEND,#REGD "32" "=asl(" #REGS "32,Rt32):sat",ATTRIBS(), \ + "Arithmetic Shift Left by Register", \ + { \ + fHIDE(size4s_t) shamt=fSXTN(7,32,RtV);\ + REGD##V = fBIDIR_ASHIFTL_SAT(REGS##V,shamt,REGSTYPE); \ + }) + +RSATSHIFTTYPES(r_sat,Rd,Rs,4_8) + + + + + +#define ISHIFTTYPES(TAGEND,SIZE,REGD,REGS,REGSTYPE,ACC,ACCSRC,SAT,SATOPT,ATTRS) \ +Q6INSN(S2_asr_i_##TAGEND,#REGD "32" #ACC "=asr(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(ATTRS), \ + "Arithmetic Shift Right by Immediate", \ + { REGD##V = SAT(ACCSRC ACC fASHIFTR(REGS##V,uiV,REGSTYPE)); }) \ +\ +Q6INSN(S2_lsr_i_##TAGEND,#REGD "32" #ACC "=lsr(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(ATTRS), \ + "Logical Shift Right by Immediate", \ + { REGD##V = SAT(ACCSRC ACC fLSHIFTR(REGS##V,uiV,REGSTYPE)); }) \ +\ +Q6INSN(S2_asl_i_##TAGEND,#REGD "32" #ACC "=asl(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(ATTRS), \ + "Shift Left by Immediate", \ + { REGD##V = SAT(ACCSRC ACC fASHIFTL(REGS##V,uiV,REGSTYPE)); }) \ +Q6INSN(S6_rol_i_##TAGEND,#REGD "32" #ACC "=rol(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(ATTRS), \ + "Rotate Left by Immediate", \ + { REGD##V = SAT(ACCSRC ACC fROTL(REGS##V,uiV,REGSTYPE)); }) + + +#define ISHIFTTYPES_ONLY_ASL(TAGEND,SIZE,REGD,REGS,REGSTYPE,ACC,ACCSRC,SAT,SATOPT) \ +Q6INSN(S2_asl_i_##TAGEND,#REGD "32" #ACC "=asl(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(), \ + "", \ + { REGD##V = SAT(ACCSRC ACC fASHIFTL(REGS##V,uiV,REGSTYPE)); }) + +#define ISHIFTTYPES_ONLY_ASR(TAGEND,SIZE,REGD,REGS,REGSTYPE,ACC,ACCSRC,SAT,SATOPT) \ +Q6INSN(S2_asr_i_##TAGEND,#REGD "32" #ACC "=asr(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(), \ + "", \ + { REGD##V = SAT(ACCSRC ACC fASHIFTR(REGS##V,uiV,REGSTYPE)); }) + + +#define ISHIFTTYPES_NOASR(TAGEND,SIZE,REGD,REGS,REGSTYPE,ACC,ACCSRC,SAT,SATOPT) \ +Q6INSN(S2_lsr_i_##TAGEND,#REGD "32" #ACC "=lsr(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(), \ + "Logical Shift Right by Register", \ + { REGD##V = SAT(ACCSRC ACC fLSHIFTR(REGS##V,uiV,REGSTYPE)); }) \ +Q6INSN(S2_asl_i_##TAGEND,#REGD "32" #ACC "=asl(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(), \ + "Shift Left by Register", \ + { REGD##V = SAT(ACCSRC ACC fASHIFTL(REGS##V,uiV,REGSTYPE)); }) \ +Q6INSN(S6_rol_i_##TAGEND,#REGD "32" #ACC "=rol(" #REGS "32,#u" #SIZE ")" #SATOPT,ATTRIBS(), \ + "Rotate Left by Immediate", \ + { REGD##V = SAT(ACCSRC ACC fROTL(REGS##V,uiV,REGSTYPE)); }) + + + +ISHIFTTYPES(r,5,Rd,Rs,4_4,,,fECHO,,) +ISHIFTTYPES(p,6,Rdd,Rss,8_8,,,fECHO,,) +ISHIFTTYPES(r_acc,5,Rx,Rs,4_4,+,RxV,fECHO,,A_ROPS_2) +ISHIFTTYPES(p_acc,6,Rxx,Rss,8_8,+,RxxV,fECHO,,A_ROPS_2) +ISHIFTTYPES(r_nac,5,Rx,Rs,4_4,-,RxV,fECHO,,A_ROPS_2) +ISHIFTTYPES(p_nac,6,Rxx,Rss,8_8,-,RxxV,fECHO,,A_ROPS_2) + +ISHIFTTYPES_NOASR(r_xacc,5,Rx,Rs,4_4,^, RxV,fECHO,) +ISHIFTTYPES_NOASR(p_xacc,6,Rxx,Rss,8_8,^, RxxV,fECHO,) + +ISHIFTTYPES(r_and,5,Rx,Rs,4_4,&,RxV,fECHO,,A_ROPS_2) +ISHIFTTYPES(r_or,5,Rx,Rs,4_4,|,RxV,fECHO,,A_ROPS_2) +ISHIFTTYPES(p_and,6,Rxx,Rss,8_8,&,RxxV,fECHO,,A_ROPS_2) +ISHIFTTYPES(p_or,6,Rxx,Rss,8_8,|,RxxV,fECHO,,A_ROPS_2) + +ISHIFTTYPES_ONLY_ASL(r_sat,5,Rd,Rs,4_8,,,fSAT,:sat) + + +Q6INSN(S2_asr_i_r_rnd,"Rd32=asr(Rs32,#u5):rnd",ATTRIBS(), + "Shift right with round", + { RdV = fASHIFTR(((fASHIFTR(RsV,uiV,4_8))+1),1,8_8); }) + + +DEF_V2_COND_MAPPING(S2_asr_i_r_rnd_goodsyntax,"Rd32=asrrnd(Rs32,#u5)","#u5==0","Rd32=Rs32","Rd32=asr(Rs32,#u5-1):rnd") + + + + +Q6INSN(S2_asr_i_p_rnd,"Rdd32=asr(Rss32,#u6):rnd",ATTRIBS(), "Shift right with round", +{ fHIDE(size8u_t tmp;) + fHIDE(size8u_t rnd;) + tmp = fASHIFTR(RssV,uiV,8_8); + rnd = tmp & 1; + RddV = fASHIFTR(tmp,1,8_8) + rnd; }) + + +DEF_V5_COND_MAPPING(S2_asr_i_p_rnd_goodsyntax,"Rdd32=asrrnd(Rss32,#u6)","#u6==0","Rdd32=Rss32","Rdd32=asr(Rss32,#u6-1):rnd") + + + +Q6INSN(S4_lsli,"Rd32=lsl(#s6,Rt32)",ATTRIBS(), "Shift an immediate left by register amount", +{ + fHIDE(size4s_t) shamt = fSXTN(7,32,RtV); + RdV = fBIDIR_LSHIFTL(siV,shamt,4_8); +}) + + + + +Q6INSN(S2_addasl_rrri,"Rd32=addasl(Rt32,Rs32,#u3)",ATTRIBS(), + "Shift left by small amount and add", + { RdV = RtV + fASHIFTL(RsV,uiV,4_4); }) + + + +#define SHIFTOPI(TAGEND,INNEROP,INNERSEM)\ +Q6INSN(S4_andi_##TAGEND,"Rx32=and(#u8,"INNEROP")",,"Shift-op",{RxV=fIMMEXT(uiV)&INNERSEM;})\ +Q6INSN(S4_ori_##TAGEND, "Rx32=or(#u8,"INNEROP")",,"Shift-op",{RxV=fIMMEXT(uiV)|INNERSEM;})\ +Q6INSN(S4_addi_##TAGEND,"Rx32=add(#u8,"INNEROP")",,"Shift-op",{RxV=fIMMEXT(uiV)+INNERSEM;})\ +Q6INSN(S4_subi_##TAGEND,"Rx32=sub(#u8,"INNEROP")",,"Shift-op",{RxV=fIMMEXT(uiV)-INNERSEM;}) + + +SHIFTOPI(asl_ri,"asl(Rx32,#U5)",(RxV<>UiV)) + + +/**********************************************/ +/* PERMUTES */ +/**********************************************/ +Q6INSN(S2_valignib,"Rdd32=valignb(Rtt32,Rss32,#u3)", +ATTRIBS(), "Vector align bytes", +{ + RddV = (fLSHIFTR(RssV,uiV*8,8_8))|(fASHIFTL(RttV,((8-uiV)*8),8_8)); +}) + +Q6INSN(S2_valignrb,"Rdd32=valignb(Rtt32,Rss32,Pu4)", +ATTRIBS(), "Align with register", +{ fPREDUSE_TIMING();RddV = fLSHIFTR(RssV,(PuV&0x7)*8,8_8)|(fASHIFTL(RttV,(8-(PuV&0x7))*8,8_8));}) + +Q6INSN(S2_vspliceib,"Rdd32=vspliceb(Rss32,Rtt32,#u3)", +ATTRIBS(), "Vector splice bytes", +{ RddV = fASHIFTL(RttV,uiV*8,8_8) | fZXTN(uiV*8,64,RssV); }) + +Q6INSN(S2_vsplicerb,"Rdd32=vspliceb(Rss32,Rtt32,Pu4)", +ATTRIBS(), "Splice with register", +{ fPREDUSE_TIMING();RddV = fASHIFTL(RttV,(PuV&7)*8,8_8) | fZXTN((PuV&7)*8,64,RssV); }) + +Q6INSN(S2_vsplatrh,"Rdd32=vsplath(Rs32)", +ATTRIBS(), "Vector splat halfwords from register", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV, fGETHALF(0,RsV)); + } +}) + + +Q6INSN(S2_vsplatrb,"Rd32=vsplatb(Rs32)", +ATTRIBS(), "Vector splat bytes from register", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETBYTE(i,RdV, fGETBYTE(0,RsV)); + } +}) + +Q6INSN(S6_vsplatrbp,"Rdd32=vsplatb(Rs32)", +ATTRIBS(), "Vector splat bytes from register", +{ + fHIDE(int i;) + for (i=0;i<8;i++) { + fSETBYTE(i,RddV, fGETBYTE(0,RsV)); + } +}) + + + +/**********************************************/ +/* Insert/Extract[u] */ +/**********************************************/ + +Q6INSN(S2_insert,"Rx32=insert(Rs32,#u5,#U5)", +ATTRIBS(), "Insert bits", +{ + fHIDE(int) width=uiV; + fHIDE(int) offset=UiV; + /* clear bits in Rxx where new bits go */ + RxV &= ~(((fCONSTLL(1)<>uiV)); + fSETWORD(0,RddV,fZXTN(uiV,32,RsV)); +}) + +Q6INSN(A4_bitsplit,"Rdd32=bitsplit(Rs32,Rt32)", +ATTRIBS(), "Split a bitfield into two registers", +{ + fHIDE(size4u_t) shamt = fZXTN(5,32,RtV); + fSETWORD(1,RddV,(fCAST4_4u(RsV)>>shamt)); + fSETWORD(0,RddV,fZXTN(shamt,32,RsV)); +}) + + + + +Q6INSN(S4_extract,"Rd32=extract(Rs32,#u5,#U5)", +ATTRIBS(), "Extract signed bitfield", +{ + fHIDE(int) width=uiV; + fHIDE(int) offset=UiV; + RdV = fSXTN(width,32,(fCAST4_4u(RsV) >> offset)); +}) + + +Q6INSN(S2_extractu,"Rd32=extractu(Rs32,#u5,#U5)", +ATTRIBS(), "Extract unsigned bitfield", +{ + fHIDE(int) width=uiV; + fHIDE(int) offset=UiV; + RdV = fZXTN(width,32,(fCAST4_4u(RsV) >> offset)); +}) + +Q6INSN(S2_insertp,"Rxx32=insert(Rss32,#u6,#U6)", +ATTRIBS(), "Insert bits", +{ + fHIDE(int) width=uiV; + fHIDE(int) offset=UiV; + /* clear bits in Rxx where new bits go */ + RxxV &= ~(((fCONSTLL(1)<> offset)); +}) + + +Q6INSN(S2_extractup,"Rdd32=extractu(Rss32,#u6,#U6)", +ATTRIBS(), "Extract unsigned bitfield", +{ + fHIDE(int) width=uiV; + fHIDE(int) offset=UiV; + RddV = fZXTN(width,64,(fCAST8_8u(RssV) >> offset)); +}) + + + + +Q6INSN(S2_mask,"Rd32=mask(#u5,#U5)", +ATTRIBS(), "Form mask from immediate", +{ + RdV = ((1<>uiV)); + } +}) + + +Q6INSN(S2_lsr_i_vh,"Rdd32=vlsrh(Rss32,#u4)",ATTRIBS(), + "Vector Logical Shift Right by Immediate", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV, (fGETUHALF(i,RssV)>>uiV)); + } +}) + +Q6INSN(S2_asl_i_vh,"Rdd32=vaslh(Rss32,#u4)",ATTRIBS(), + "Vector Arithmetic Shift Left by Immediate", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV, (fGETHALF(i,RssV)<> uiV )+1)>>1 )); + } +}) + +DEF_V5_COND_MAPPING(S5_asrhub_rnd_sat_goodsyntax,"Rd32=vasrhub(Rss32,#u4):rnd:sat","#u4==0","Rd32=vsathub(Rss32)","Rd32=vasrhub(Rss32,#u4-1):raw") + +Q6INSN(S5_asrhub_sat,"Rd32=vasrhub(Rss32,#u4):sat",, + "Vector Arithmetic Shift Right by Immediate with Saturate and Pack", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETBYTE(i,RdV, fSATUB( fGETHALF(i,RssV) >> uiV )); + } +}) + + + +Q6INSN(S5_vasrhrnd,"Rdd32=vasrh(Rss32,#u4):raw",, + "Vector Arithmetic Shift Right by Immediate with Round", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV, ( ((fGETHALF(i,RssV) >> uiV)+1)>>1 )); + } +}) + +DEF_V5_COND_MAPPING(S5_vasrhrnd_goodsyntax,"Rdd32=vasrh(Rss32,#u4):rnd","#u4==0","Rdd32=Rss32","Rdd32=vasrh(Rss32,#u4-1):raw") + + + +Q6INSN(S2_asl_r_vh,"Rdd32=vaslh(Rss32,Rt32)",ATTRIBS(A_NOTE_OOBVSHIFT), + "Vector Arithmetic Shift Left by Register", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV, fBIDIR_ASHIFTL(fGETHALF(i,RssV),fSXTN(7,32,RtV),2_8)); + } +}) + + + +Q6INSN(S2_lsr_r_vh,"Rdd32=vlsrh(Rss32,Rt32)",ATTRIBS(A_NOTE_OOBVSHIFT), + "Vector Logical Shift Right by Register", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV, fBIDIR_LSHIFTR(fGETUHALF(i,RssV),fSXTN(7,32,RtV),2_8)); + } +}) + + +Q6INSN(S2_lsl_r_vh,"Rdd32=vlslh(Rss32,Rt32)",ATTRIBS(A_NOTE_OOBVSHIFT), + "Vector Logical Shift Left by Register", +{ + fHIDE(int i;) + for (i=0;i<4;i++) { + fSETHALF(i,RddV, fBIDIR_LSHIFTL(fGETUHALF(i,RssV),fSXTN(7,32,RtV),2_8)); + } +}) + + + + +/* Word Vector Immediate Shifts */ + +Q6INSN(S2_asr_i_vw,"Rdd32=vasrw(Rss32,#u5)",ATTRIBS(), + "Vector Arithmetic Shift Right by Immediate", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,(fGETWORD(i,RssV)>>uiV)); + } +}) + + + +Q6INSN(S2_asr_i_svw_trun,"Rd32=vasrw(Rss32,#u5)",ATTRIBS(A_ARCHV2), + "Vector Arithmetic Shift Right by Immediate with Truncate and Pack", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETHALF(i,RdV,fGETHALF(0,(fGETWORD(i,RssV)>>uiV))); + } +}) + +Q6INSN(S2_asr_r_svw_trun,"Rd32=vasrw(Rss32,Rt32)",ATTRIBS(A_ARCHV2), + "Vector Arithmetic Shift Right truncate and Pack", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETHALF(i,RdV,fGETHALF(0,fBIDIR_ASHIFTR(fGETWORD(i,RssV),fSXTN(7,32,RtV),4_8))); + } +}) + + +Q6INSN(S2_lsr_i_vw,"Rdd32=vlsrw(Rss32,#u5)",ATTRIBS(), + "Vector Logical Shift Right by Immediate", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,(fGETUWORD(i,RssV)>>uiV)); + } +}) + +Q6INSN(S2_asl_i_vw,"Rdd32=vaslw(Rss32,#u5)",ATTRIBS(), + "Vector Arithmetic Shift Left by Immediate", +{ + fHIDE(int i;) + for (i=0;i<2;i++) { + fSETWORD(i,RddV,(fGETWORD(i,RssV)<> 1) | (fCAST8u((1&fCOUNTONES_8(RssV & RttV)))<<63) ; }) + +Q6INSN(S2_clbnorm,"Rd32=normamt(Rs32)",ATTRIBS(A_ARCHV2), +"Count leading sign bits - 1", { if (RsV == 0) { RdV = 0; } else { RdV = (fMAX(fCL1_4(RsV),fCL1_4(~RsV)))-1;} }) + +Q6INSN(S4_clbaddi,"Rd32=add(clb(Rs32),#s6)",ATTRIBS(A_ARCHV2), +"Count leading sign bits then add signed number", +{ RdV = (fMAX(fCL1_4(RsV),fCL1_4(~RsV)))+siV;} ) + +Q6INSN(S4_clbpnorm,"Rd32=normamt(Rss32)",ATTRIBS(A_ARCHV2), +"Count leading sign bits - 1", { if (RssV == 0) { RdV = 0; } +else { RdV = (fMAX(fCL1_8(RssV),fCL1_8(~RssV)))-1;}}) + +Q6INSN(S4_clbpaddi,"Rd32=add(clb(Rss32),#s6)",ATTRIBS(A_ARCHV2), +"Count leading sign bits then add signed number", +{ RdV = (fMAX(fCL1_8(RssV),fCL1_8(~RssV)))+siV;}) + + + +Q6INSN(S2_cabacdecbin,"Rdd32=decbin(Rss32,Rtt32)",ATTRIBS(A_ARCHV3,A_NOTE_LATEPRED,A_RESTRICT_LATEPRED),"CABAC decode bin", +{ + fHIDE(size4u_t state;) + fHIDE(size4u_t valMPS;) + fHIDE(size4u_t bitpos;) + fHIDE(size4u_t range;) + fHIDE(size4u_t offset;) + fHIDE(size4u_t rLPS;) + fHIDE(size4u_t rMPS;) + + state = fEXTRACTU_RANGE( fGETWORD(1,RttV) ,5,0); + valMPS = fEXTRACTU_RANGE( fGETWORD(1,RttV) ,8,8); + bitpos = fEXTRACTU_RANGE( fGETWORD(0,RttV) ,4,0); + range = fGETWORD(0,RssV); + offset = fGETWORD(1,RssV); + + /* calculate rLPS */ + range <<= bitpos; + offset <<= bitpos; + rLPS = rLPS_table_64x4[state][ (range >>29)&3]; + rLPS = rLPS << 23; /* left aligned */ + + /* calculate rMPS */ + rMPS= (range&0xff800000) - rLPS; + + /* most probable region */ + if (offset < rMPS) { + RddV = AC_next_state_MPS_64[state]; + fINSERT_RANGE(RddV,8,8,valMPS); + fINSERT_RANGE(RddV,31,23,(rMPS>>23)); + fSETWORD(1,RddV,offset); + fWRITE_P0(valMPS); + + + } + /* least probable region */ + else { + RddV = AC_next_state_LPS_64[state]; + fINSERT_RANGE(RddV,8,8,((!state)?(1-valMPS):(valMPS))); + fINSERT_RANGE(RddV,31,23,(rLPS>>23)); + fSETWORD(1,RddV,(offset-rMPS)); + fWRITE_P0((valMPS^1)); + } + fHIDE(MARK_LATE_PRED_WRITE(0)) +}) + + + + +Q6INSN(S2_cabacencbin,"Rdd32=encbin(Rss32,Rtt32,Pu4)",ATTRIBS(A_FAKEINSN),"CABAC encode bin", +{ + fPREDUSE_TIMING(); + fHIDE(size4u_t state;) + fHIDE(size4u_t valMPS;) + fHIDE(size4u_t bitpos;) + fHIDE(size4u_t range;) + fHIDE(size4u_t low;) + fHIDE(size4u_t rLPS;) + fHIDE(size4u_t rMPS;) + fHIDE(size4u_t bin;) + + state = fEXTRACTU_RANGE( fGETWORD(1,RttV) ,5,0); + valMPS = fEXTRACTU_RANGE( fGETWORD(1,RttV) ,8,8); + bitpos = fEXTRACTU_RANGE( fGETWORD(0,RttV) ,4,0); + range = fGETWORD(0,RssV); + low = fGETWORD(1,RssV); + bin = fLSBOLD(PuV); + + /* Strip of MSB zeros */ + range <<= bitpos; + /* Mask off low bits */ + range &= 0xFF800000U; + + /* calculate rLPS */ + rLPS = rLPS_table_64x4[state][(range>>29)&3]; + rLPS = rLPS << 23; /* left aligned */ + + /* Calculate rMPS */ + rMPS = range - rLPS; + + /* most probable region */ + if (bin == valMPS) { + RddV = AC_next_state_MPS_64[state]; + fINSERT_RANGE(RddV,8,8,valMPS); + fINSERT_RANGE(RddV,31,23,(rMPS>>23)); + fSETWORD(1,RddV,low); + } + /* least probable region */ + else { + RddV = AC_next_state_LPS_64[state]; + fINSERT_RANGE(RddV,8,8,((!state)?(1-valMPS):(valMPS))); + fINSERT_RANGE(RddV,31,23,(rLPS>>23)); + fSETWORD(1,RddV,(low+(rMPS >> bitpos))); + } +}) + + + + +Q6INSN(S2_clb,"Rd32=clb(Rs32)",ATTRIBS(), +"Count leading bits", {RdV = fMAX(fCL1_4(RsV),fCL1_4(~RsV));}) + + +Q6INSN(S2_cl0,"Rd32=cl0(Rs32)",ATTRIBS(), +"Count leading bits", {RdV = fCL1_4(~RsV);}) + +Q6INSN(S2_cl1,"Rd32=cl1(Rs32)",ATTRIBS(), +"Count leading bits", {RdV = fCL1_4(RsV);}) + +Q6INSN(S2_clbp,"Rd32=clb(Rss32)",ATTRIBS(), +"Count leading bits", {RdV = fMAX(fCL1_8(RssV),fCL1_8(~RssV));}) + +Q6INSN(S2_cl0p,"Rd32=cl0(Rss32)",ATTRIBS(), +"Count leading bits", {RdV = fCL1_8(~RssV);}) + +Q6INSN(S2_cl1p,"Rd32=cl1(Rss32)",ATTRIBS(), +"Count leading bits", {RdV = fCL1_8(RssV);}) + + + + +Q6INSN(S2_brev, "Rd32=brev(Rs32)", ATTRIBS(A_ARCHV2), "Bit Reverse",{RdV = fBREV_4(RsV);}) +Q6INSN(S2_brevp,"Rdd32=brev(Rss32)", ATTRIBS(), "Bit Reverse",{RddV = fBREV_8(RssV);}) +Q6INSN(S2_ct0, "Rd32=ct0(Rs32)", ATTRIBS(A_ARCHV2), "Count Trailing",{RdV = fCL1_4(~fBREV_4(RsV));}) +Q6INSN(S2_ct1, "Rd32=ct1(Rs32)", ATTRIBS(A_ARCHV2), "Count Trailing",{RdV = fCL1_4(fBREV_4(RsV));}) +Q6INSN(S2_ct0p, "Rd32=ct0(Rss32)", ATTRIBS(), "Count Trailing",{RdV = fCL1_8(~fBREV_8(RssV));}) +Q6INSN(S2_ct1p, "Rd32=ct1(Rss32)", ATTRIBS(), "Count Trailing",{RdV = fCL1_8(fBREV_8(RssV));}) + + +Q6INSN(S2_interleave,"Rdd32=interleave(Rss32)",ATTRIBS(A_ARCHV2),"Interleave bits", +{RddV = fINTERLEAVE(fGETWORD(1,RssV),fGETWORD(0,RssV));}) + +Q6INSN(S2_deinterleave,"Rdd32=deinterleave(Rss32)",ATTRIBS(A_ARCHV2),"Interleave bits", +{RddV = fDEINTERLEAVE(RssV);}) + + + + +Q6INSN(S6_userinsn,"Rxx32=userinsn(Rss32,Rtt32,#u7)",ATTRIBS(A_FAKEINSN), +"User Defined Instruction", +{ + if (thread->processor_ptr->options->userinsn_callback) { + RxxV = thread->processor_ptr->options->userinsn_callback( + thread->system_ptr,thread->processor_ptr, + thread->threadId,RssV,RttV,RxxV,uiV); + } else { + warn("User instruction encountered but no callback!"); + } + +}) + + + + + + + diff --git a/target/hexagon/imported/subinsns.idef b/target/hexagon/imported/subinsns.idef new file mode 100644 index 0000000..d1cd81f --- /dev/null +++ b/target/hexagon/imported/subinsns.idef @@ -0,0 +1,152 @@ +/* + * Copyright (c) 2019 Qualcomm Innovation Center, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +/* + * sub-instructions + */ + + + +/*****************************************************************/ +/* */ +/* A-type subinsns */ +/* */ +/*****************************************************************/ + +Q6INSN(SA1_addi, "Rx16=add(Rx16,#s7)", ATTRIBS(A_SUBINSN),"Add", { fIMMEXT(siV); RxV=RxV+siV;}) +Q6INSN(SA1_tfr, "Rd16=Rs16", ATTRIBS(A_SUBINSN),"Tfr", { RdV=RsV;}) +Q6INSN(SA1_seti, "Rd16=#u6", ATTRIBS(A_SUBINSN),"Set immed", { fIMMEXT(uiV); RdV=uiV;}) +Q6INSN(SA1_setin1, "Rd16=#-1", ATTRIBS(A_SUBINSN),"Set to -1", { RdV=-1;}) +Q6INSN(SA1_clrtnew, "if (p0.new) Rd16=#0", ATTRIBS(A_SUBINSN),"clear if true", { if (fLSBNEW0) {RdV=0;} else {CANCEL;} }) +Q6INSN(SA1_clrfnew, "if (!p0.new) Rd16=#0", ATTRIBS(A_SUBINSN),"clear if false",{ if (fLSBNEW0NOT) {RdV=0;} else {CANCEL;} }) +Q6INSN(SA1_clrt, "if (p0) Rd16=#0", ATTRIBS(A_SUBINSN),"clear if true", { if (fLSBOLD(fREAD_P0())) {RdV=0;} else {CANCEL;} }) +Q6INSN(SA1_clrf, "if (!p0) Rd16=#0", ATTRIBS(A_SUBINSN),"clear if false",{ if (fLSBOLDNOT(fREAD_P0())) {RdV=0;} else {CANCEL;} }) + +Q6INSN(SA1_addsp, "Rd16=add(r29,#u6:2)", ATTRIBS(A_SUBINSN),"Add", { RdV=fREAD_SP()+uiV; }) +Q6INSN(SA1_inc, "Rd16=add(Rs16,#1)", ATTRIBS(A_SUBINSN),"Inc", { RdV=RsV+1;}) +Q6INSN(SA1_dec, "Rd16=add(Rs16,#-1)", ATTRIBS(A_SUBINSN),"Dec", { RdV=RsV-1;}) +Q6INSN(SA1_addrx, "Rx16=add(Rx16,Rs16)", ATTRIBS(A_SUBINSN,A_COMMUTES),"Add", { RxV=RxV+RsV; }) +Q6INSN(SA1_zxtb, "Rd16=and(Rs16,#255)", ATTRIBS(A_SUBINSN),"Zxtb", { RdV= fZXTN(8,32,RsV);}) +Q6INSN(SA1_and1, "Rd16=and(Rs16,#1)", ATTRIBS(A_SUBINSN),"And #1", { RdV= RsV&1;}) +Q6INSN(SA1_sxtb, "Rd16=sxtb(Rs16)", ATTRIBS(A_SUBINSN),"Sxtb", { RdV= fSXTN(8,32,RsV);}) +Q6INSN(SA1_zxth, "Rd16=zxth(Rs16)", ATTRIBS(A_SUBINSN),"Zxth", { RdV= fZXTN(16,32,RsV);}) +Q6INSN(SA1_sxth, "Rd16=sxth(Rs16)", ATTRIBS(A_SUBINSN),"Sxth", { RdV= fSXTN(16,32,RsV);}) +Q6INSN(SA1_combinezr,"Rdd8=combine(#0,Rs16)", ATTRIBS(A_SUBINSN,A_ROPS_2),"Combines", { fSETWORD(0,RddV,RsV); fSETWORD(1,RddV,0); }) +Q6INSN(SA1_combinerz,"Rdd8=combine(Rs16,#0)", ATTRIBS(A_SUBINSN,A_ROPS_2),"Combines", { fSETWORD(0,RddV,0); fSETWORD(1,RddV,RsV); }) +Q6INSN(SA1_combine0i,"Rdd8=combine(#0,#u2)", ATTRIBS(A_SUBINSN,A_ROPS_2),"Combines", { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,0); }) +Q6INSN(SA1_combine1i,"Rdd8=combine(#1,#u2)", ATTRIBS(A_SUBINSN,A_ROPS_2),"Combines", { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,1); }) +Q6INSN(SA1_combine2i,"Rdd8=combine(#2,#u2)", ATTRIBS(A_SUBINSN,A_ROPS_2),"Combines", { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,2); }) +Q6INSN(SA1_combine3i,"Rdd8=combine(#3,#u2)", ATTRIBS(A_SUBINSN,A_ROPS_2),"Combines", { fSETWORD(0,RddV,uiV); fSETWORD(1,RddV,3); }) +Q6INSN(SA1_cmpeqi, "p0=cmp.eq(Rs16,#u2)", ATTRIBS(A_SUBINSN),"CompareImmed",{fWRITE_P0(f8BITSOF(RsV==uiV));}) + + + + +/*****************************************************************/ +/* */ +/* Ld1/2 subinsns */ +/* */ +/*****************************************************************/ + +Q6INSN(SL1_loadri_io, "Rd16=memw(Rs16+#u4:2)", ATTRIBS(A_MEMSIZE_4B,A_LOAD,A_SUBINSN),"load word", {fEA_RI(RsV,uiV); fLOAD(1,4,u,EA,RdV);}) +Q6INSN(SL1_loadrub_io, "Rd16=memub(Rs16+#u4:0)",ATTRIBS(A_MEMSIZE_1B,A_LOAD,A_SUBINSN),"load byte", {fEA_RI(RsV,uiV); fLOAD(1,1,u,EA,RdV);}) + +Q6INSN(SL2_loadrh_io, "Rd16=memh(Rs16+#u3:1)", ATTRIBS(A_MEMSIZE_2B,A_LOAD,A_SUBINSN),"load half", {fEA_RI(RsV,uiV); fLOAD(1,2,s,EA,RdV);}) +Q6INSN(SL2_loadruh_io, "Rd16=memuh(Rs16+#u3:1)",ATTRIBS(A_MEMSIZE_2B,A_LOAD,A_SUBINSN),"load half", {fEA_RI(RsV,uiV); fLOAD(1,2,u,EA,RdV);}) +Q6INSN(SL2_loadrb_io, "Rd16=memb(Rs16+#u3:0)", ATTRIBS(A_MEMSIZE_1B,A_LOAD,A_SUBINSN),"load byte", {fEA_RI(RsV,uiV); fLOAD(1,1,s,EA,RdV);}) +Q6INSN(SL2_loadri_sp, "Rd16=memw(r29+#u5:2)", ATTRIBS(A_MEMSIZE_4B,A_LOAD,A_SUBINSN),"load word", {fEA_RI(fREAD_SP(),uiV); fLOAD(1,4,u,EA,RdV);}) +Q6INSN(SL2_loadrd_sp, "Rdd8=memd(r29+#u5:3)", ATTRIBS(A_MEMSIZE_8B,A_LOAD,A_SUBINSN),"load dword",{fEA_RI(fREAD_SP(),uiV); fLOAD(1,8,u,EA,RddV);}) + +Q6INSN(SL2_deallocframe,"deallocframe", ATTRIBS(A_SUBINSN,A_MEMSIZE_8B,A_LOAD,A_DEALLOCFRAME), "Deallocate stack frame", +{ fHIDE(size8u_t tmp;) fEA_REG(fREAD_FP()); + fLOAD(1,8,u,EA,tmp); + tmp = fFRAME_UNSCRAMBLE(tmp); + fWRITE_LR(fGETWORD(1,tmp)); + fWRITE_FP(fGETWORD(0,tmp)); + fWRITE_SP(EA+8); }) + +Q6INSN(SL2_return,"dealloc_return", ATTRIBS(A_JINDIR,A_SUBINSN,A_ROPS_2,A_MEMSIZE_8B,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY,A_RET_TYPE,A_DEALLOCRET), "Deallocate stack frame and return", +{ fHIDE(size8u_t tmp;) fEA_REG(fREAD_FP()); + fLOAD(1,8,u,EA,tmp); + tmp = fFRAME_UNSCRAMBLE(tmp); + fWRITE_LR(fGETWORD(1,tmp)); + fWRITE_FP(fGETWORD(0,tmp)); + fWRITE_SP(EA+8); + fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);}) + +Q6INSN(SL2_return_t,"if (p0) dealloc_return", ATTRIBS(A_JINDIROLD,A_SUBINSN,A_ROPS_2,A_MEMSIZE_8B,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY,A_RET_TYPE,A_PRED_BIT_4), "Deallocate stack frame and return", +{ fHIDE(size8u_t tmp;); fBRANCH_SPECULATE_STALL(fLSBOLD(fREAD_P0()),, SPECULATE_NOT_TAKEN,4,0); fEA_REG(fREAD_FP()); if (fLSBOLD(fREAD_P0())) { fLOAD(1,8,u,EA,tmp); tmp = fFRAME_UNSCRAMBLE(tmp); fWRITE_LR(fGETWORD(1,tmp)); fWRITE_FP(fGETWORD(0,tmp)); fWRITE_SP(EA+8); + fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);} else {LOAD_CANCEL(EA);} }) + +Q6INSN(SL2_return_f,"if (!p0) dealloc_return", ATTRIBS(A_JINDIROLD,A_SUBINSN,A_ROPS_2,A_MEMSIZE_8B,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY,A_RET_TYPE,A_PRED_BIT_4), "Deallocate stack frame and return", +{ fHIDE(size8u_t tmp;);fBRANCH_SPECULATE_STALL(fLSBOLDNOT(fREAD_P0()),, SPECULATE_NOT_TAKEN,4,0); fEA_REG(fREAD_FP()); if (fLSBOLDNOT(fREAD_P0())) { fLOAD(1,8,u,EA,tmp); tmp = fFRAME_UNSCRAMBLE(tmp); fWRITE_LR(fGETWORD(1,tmp)); fWRITE_FP(fGETWORD(0,tmp)); fWRITE_SP(EA+8); + fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);} else {LOAD_CANCEL(EA);} }) + + + +Q6INSN(SL2_return_tnew,"if (p0.new) dealloc_return:nt", ATTRIBS(A_JINDIRNEW,A_SUBINSN,A_ROPS_2,A_MEMSIZE_8B,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY,A_RET_TYPE,A_PRED_BIT_4), "Deallocate stack frame and return", +{ fHIDE(size8u_t tmp;) fBRANCH_SPECULATE_STALL(fLSBNEW0,, SPECULATE_NOT_TAKEN , 4,3); fEA_REG(fREAD_FP()); if (fLSBNEW0) { fLOAD(1,8,u,EA,tmp); tmp = fFRAME_UNSCRAMBLE(tmp); fWRITE_LR(fGETWORD(1,tmp)); fWRITE_FP(fGETWORD(0,tmp)); fWRITE_SP(EA+8); + fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);} else {LOAD_CANCEL(EA);} }) + +Q6INSN(SL2_return_fnew,"if (!p0.new) dealloc_return:nt", ATTRIBS(A_JINDIRNEW,A_SUBINSN,A_ROPS_2,A_MEMSIZE_8B,A_LOAD,A_RETURN,A_RESTRICT_SLOT0ONLY,A_RET_TYPE,A_PRED_BIT_4), "Deallocate stack frame and return", +{ fHIDE(size8u_t tmp;) fBRANCH_SPECULATE_STALL(fLSBNEW0NOT,, SPECULATE_NOT_TAKEN , 4,3); fEA_REG(fREAD_FP()); if (fLSBNEW0NOT) { fLOAD(1,8,u,EA,tmp); tmp = fFRAME_UNSCRAMBLE(tmp); fWRITE_LR(fGETWORD(1,tmp)); fWRITE_FP(fGETWORD(0,tmp)); fWRITE_SP(EA+8); + fJUMPR(REG_LR,fGETWORD(1,tmp),COF_TYPE_JUMPR);} else {LOAD_CANCEL(EA);} }) + + +Q6INSN(SL2_jumpr31,"jumpr r31",ATTRIBS(A_SUBINSN,A_JINDIR,A_RESTRICT_SLOT0ONLY,A_RET_TYPE),"indirect unconditional jump", +{ fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);}) + +Q6INSN(SL2_jumpr31_t,"if (p0) jumpr r31",ATTRIBS(A_SUBINSN,A_JINDIROLD,A_NOTE_CONDITIONAL,A_RESTRICT_SLOT0ONLY,A_RET_TYPE,A_PRED_BIT_4),"indirect conditional jump if true", +{fBRANCH_SPECULATE_STALL(fLSBOLD(fREAD_P0()),, SPECULATE_TAKEN,4,0); if (fLSBOLD(fREAD_P0())) {fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);}}) + +Q6INSN(SL2_jumpr31_f,"if (!p0) jumpr r31",ATTRIBS(A_SUBINSN,A_JINDIROLD,A_NOTE_CONDITIONAL,A_RESTRICT_SLOT0ONLY,A_RET_TYPE,A_PRED_BIT_4),"indirect conditional jump if false", +{fBRANCH_SPECULATE_STALL(fLSBOLDNOT(fREAD_P0()),, SPECULATE_TAKEN,4,0); if (fLSBOLDNOT(fREAD_P0())) {fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);}}) + + + +Q6INSN(SL2_jumpr31_tnew,"if (p0.new) jumpr:nt r31",ATTRIBS(A_SUBINSN,A_JINDIRNEW,A_NOTE_CONDITIONAL,A_RESTRICT_SLOT0ONLY,A_RET_TYPE,A_PRED_BIT_4),"indirect conditional jump if true", +{fBRANCH_SPECULATE_STALL(fLSBNEW0,, SPECULATE_NOT_TAKEN , 4,3); if (fLSBNEW0) {fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);}}) + +Q6INSN(SL2_jumpr31_fnew,"if (!p0.new) jumpr:nt r31",ATTRIBS(A_SUBINSN,A_JINDIRNEW,A_NOTE_CONDITIONAL,A_RESTRICT_SLOT0ONLY,A_RET_TYPE,A_PRED_BIT_4),"indirect conditional jump if false", +{fBRANCH_SPECULATE_STALL(fLSBNEW0NOT,, SPECULATE_NOT_TAKEN , 4,3); if (fLSBNEW0NOT) {fJUMPR(REG_LR,fREAD_LR(),COF_TYPE_JUMPR);}}) + + + + + +/*****************************************************************/ +/* */ +/* St1/2 subinsns */ +/* */ +/*****************************************************************/ + +Q6INSN(SS1_storew_io, "memw(Rs16+#u4:2)=Rt16", ATTRIBS(A_MEMSIZE_4B,A_STORE,A_SUBINSN), "store word", {fEA_RI(RsV,uiV); fSTORE(1,4,EA,RtV);}) +Q6INSN(SS1_storeb_io, "memb(Rs16+#u4:0)=Rt16", ATTRIBS(A_MEMSIZE_1B,A_STORE,A_SUBINSN), "store byte", {fEA_RI(RsV,uiV); fSTORE(1,1,EA,fGETBYTE(0,RtV));}) +Q6INSN(SS2_storeh_io, "memh(Rs16+#u3:1)=Rt16", ATTRIBS(A_MEMSIZE_2B,A_STORE,A_SUBINSN), "store half", {fEA_RI(RsV,uiV); fSTORE(1,2,EA,fGETHALF(0,RtV));}) +Q6INSN(SS2_stored_sp, "memd(r29+#s6:3)=Rtt8", ATTRIBS(A_MEMSIZE_8B,A_STORE,A_SUBINSN), "store dword",{fEA_RI(fREAD_SP(),siV); fSTORE(1,8,EA,RttV);}) +Q6INSN(SS2_storew_sp, "memw(r29+#u5:2)=Rt16", ATTRIBS(A_MEMSIZE_4B,A_STORE,A_SUBINSN), "store word", {fEA_RI(fREAD_SP(),uiV); fSTORE(1,4,EA,RtV);}) +Q6INSN(SS2_storewi0, "memw(Rs16+#u4:2)=#0", ATTRIBS(A_MEMSIZE_4B,A_STORE,A_SUBINSN,A_ROPS_2), "store word", {fEA_RI(RsV,uiV); fSTORE(1,4,EA,0);}) +Q6INSN(SS2_storebi0, "memb(Rs16+#u4:0)=#0", ATTRIBS(A_MEMSIZE_1B,A_STORE,A_SUBINSN,A_ROPS_2), "store byte", {fEA_RI(RsV,uiV); fSTORE(1,1,EA,0);}) +Q6INSN(SS2_storewi1, "memw(Rs16+#u4:2)=#1", ATTRIBS(A_MEMSIZE_4B,A_STORE,A_SUBINSN,A_ROPS_2), "store word", {fEA_RI(RsV,uiV); fSTORE(1,4,EA,1);}) +Q6INSN(SS2_storebi1, "memb(Rs16+#u4:0)=#1", ATTRIBS(A_MEMSIZE_1B,A_STORE,A_SUBINSN,A_ROPS_2), "store byte", {fEA_RI(RsV,uiV); fSTORE(1,1,EA,1);}) + + +Q6INSN(SS2_allocframe,"allocframe(#u5:3)", ATTRIBS(A_SUBINSN,A_MEMSIZE_8B,A_STORE,A_RESTRICT_SLOT0ONLY), "Allocate stack frame", +{ fEA_RI(fREAD_SP(),-8); fSTORE(1,8,EA,fFRAME_SCRAMBLE((fCAST8_8u(fREAD_LR()) << 32) | fCAST4_4u(fREAD_FP()))); fWRITE_FP(EA); fFRAMECHECK(EA-uiV,EA); fWRITE_SP(EA-uiV); }) + + + diff --git a/target/hexagon/imported/system.idef b/target/hexagon/imported/system.idef new file mode 100644 index 0000000..8dbf52f --- /dev/null +++ b/target/hexagon/imported/system.idef @@ -0,0 +1,302 @@ +/* + * Copyright (c) 2019 Qualcomm Innovation Center, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +/* + * System Interface Instructions + */ + + + +/********************************************/ +/* User->OS interface */ +/********************************************/ + +Q6INSN(J2_trap0,"trap0(#u8)",ATTRIBS(A_COF,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET), +"Trap to Operating System", + fTRAP(0,uiV); +) + +Q6INSN(J2_trap1,"trap1(Rx32,#u8)",ATTRIBS(A_COF,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET), +"Trap to Operating System", + /* + * Note: if RxV is not written, we get the same as the input. + * Since trap1 is SOLO, this means the register will effectively not be updated + */ + if (!fTRAP1_VIRTINSN(uiV)) { + fTRAP(1,uiV); + } else if (uiV == 1) { + fVIRTINSN_RTE(uiV,RxV); + } else if (uiV == 3) { + fVIRTINSN_SETIE(uiV,RxV); + } else if (uiV == 4) { + fVIRTINSN_GETIE(uiV,RxV); + } else if (uiV == 6) { + fVIRTINSN_SPSWAP(uiV,RxV); + }) + +DEF_MAPPING(J2_trap1_noregmap,"trap1(#u8)","trap1(R0,#u8)") + +Q6INSN(J2_pause,"pause(#u8)",ATTRIBS(A_COF,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET), +"Enter low-power state for #u8 cycles",{fPAUSE(uiV);}) + +Q6INSN(J2_rte, "rte", ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NO_TIMING_LOG), +"Return from Exception", +{ +fHIDE(if((thread->timing_on) && (thread->status & EXEC_STATUS_REPLAY)) { return; }) +fHIDE(CALLBACK(thread->processor_ptr->options->rte_callback, + thread->system_ptr,thread->processor_ptr, + thread->threadId,0);) +fCLEAR_RTE_EX(); +fBRANCH(fREAD_ELR(),COF_TYPE_RTE);}) + + +/********************************************/ +/* Interrupt Management */ +/********************************************/ + +Q6INSN(Y2_swi,"swi(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_AXOK,A_RESTRICT_PACKET_AXOK),"Software Interrupt",{DO_SWI(RsV);}) +Q6INSN(Y2_cswi,"cswi(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_AXOK,A_RESTRICT_PACKET_AXOK),"Cancel Software Interrupt",{DO_CSWI(RsV);}) +Q6INSN(Y2_ciad,"ciad(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_AXOK,A_RESTRICT_PACKET_AXOK),"Re-enable interrupt in IAD",{DO_CIAD(RsV);}) +Q6INSN(Y4_siad,"siad(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_AXOK,A_RESTRICT_PACKET_AXOK),"Disable interrupt in IAD",{DO_SIAD(RsV);}) +Q6INSN(Y2_iassignr,"Rd32=iassignr(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_AXOK,A_RESTRICT_PACKET_AXOK),"Read interrupt to thread assignments",{DO_IASSIGNR(RsV,RdV);}) +Q6INSN(Y2_iassignw,"iassignw(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_AXOK,A_RESTRICT_PACKET_AXOK),"Write interrupt to thread assignments",{DO_IASSIGNW(RsV);}) + + +Q6INSN(Y2_getimask,"Rd32=getimask(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_AXOK,A_RESTRICT_PACKET_AXOK),"Read imask register of another thread", +{RdV = READ_IMASK(RsV & thread->processor_ptr->thread_system_mask); }) + +Q6INSN(Y2_setimask,"setimask(Pt4,Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_AXOK,A_RESTRICT_PACKET_AXOK),"Change imask register of another thread", +{fPREDUSE_TIMING();WRITE_IMASK(PtV & thread->processor_ptr->thread_system_mask,RsV); }) + + + +/********************************************/ +/* TLB management */ +/********************************************/ + +Q6INSN(Y2_tlbw,"tlbw(Rss32,Rt32)", ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET), +"Write TLB entry", {fTLBW(RtV,RssV);}) + +Q6INSN(Y5_ctlbw,"Rd32=ctlbw(Rss32,Rt32)", ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET), +"Conditional Write TLB entry", +{ + if (fTLB_ENTRY_OVERLAP( (1LL<<63) | RssV )) { + RdV=fTLB_ENTRY_OVERLAP_IDX( (1LL<<63) | RssV); + } else { + fTLBW(RtV,RssV); + RdV=0x80000000; + } +}) + +Q6INSN(Y5_tlboc,"Rd32=tlboc(Rss32)", ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET), +"TLB overlap check", +{ + if (fTLB_ENTRY_OVERLAP( (1LL<<63) | RssV )) { + RdV=fTLB_ENTRY_OVERLAP_IDX( (1LL<<63) | RssV); + } else { + RdV=0x80000000; + } +}) + +Q6INSN(Y2_tlbr,"Rdd32=tlbr(Rs32)", ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET), "Read TLB entry", +{RddV = fTLBR(RsV);}) + +Q6INSN(Y2_tlbp,"Rd32=tlbp(Rs32)", ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET), "Probe TLB", {RdV=fTLBP(RsV);}) + +Q6INSN(Y5_tlbasidi,"tlbinvasid(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET), "Invalidate ASID", +{ + fHIDE(int i;) + fHIDE(unsigned int NUM_TLB_ENTRIES = NUM_TLB_REGS(thread->processor_ptr);) + for (i = 0; i < NUM_TLB_ENTRIES; i++) { + if ((fGET_FIELD(fTLBR(i),PTE_G) == 0) && + (fGET_FIELD(fTLBR(i),PTE_ASID) == fEXTRACTU_RANGE(RsV,26,20))) { + fTLBW(i,fTLBR(i) & ~(1ULL << 63)); + } + } +}) + +Q6INSN(Y2_tlblock,"tlblock", ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET,A_NO_TIMING_LOG), "Lock TLB", +{fSET_TLB_LOCK()}) + +Q6INSN(Y2_tlbunlock,"tlbunlock", ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET), "Unlock TLB", +{fCLEAR_TLB_LOCK();}) + +DEF_MAPPING(Y2_k1lock_map,"k1lock","tlblock") +DEF_MAPPING(Y2_k1unlock_map,"k1unlock","tlbunlock") + +Q6INSN(Y2_k0lock,"k0lock", ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET,A_NO_TIMING_LOG), "Lock K0", +{fSET_K0_LOCK()}) + +Q6INSN(Y2_k0unlock,"k0unlock", ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET), "Unlock K0", +{fCLEAR_K0_LOCK();}) + +/********************************************/ +/* Supervisor Reg Management */ +/********************************************/ + +DEF_V4_MAPPING(Y2_crswap_old,"crswap(Rx32,sgp)","crswap(Rx32,sgp0)") + +Q6INSN(Y2_crswap0,"crswap(Rx32,sgp0)",ATTRIBS(A_PRIV,A_NOTE_PRIV), "Swap system general pointer 0 with GPR", +{fHIDE(size4s_t tmp;) tmp = RxV; RxV = READ_SGP0(); WRITE_SGP0(tmp);}) +Q6INSN(Y4_crswap1,"crswap(Rx32,sgp1)",ATTRIBS(A_PRIV,A_NOTE_PRIV), "Swap system general pointer 1 with GPR", +{fHIDE(size4s_t tmp;) tmp = RxV; RxV = READ_SGP1(); WRITE_SGP1(tmp);}) + +Q6INSN(Y4_crswap10,"crswap(Rxx32,sgp1:0)",ATTRIBS(A_PRIV,A_NOTE_PRIV), "Swap system general purpose 0/1 with GPR Pair", +{fHIDE(size8s_t tmp;) tmp = RxxV; RxxV=READ_SGP10(); WRITE_SGP10(tmp);}) + +Q6INSN(Y2_tfrscrr,"Rd32=Ss128",ATTRIBS(A_PRIV,A_NOTE_PRIV),"Transfer Supervisor Reg to GPR", {RdV=SsV;}) +Q6INSN(Y2_tfrsrcr,"Sd128=Rs32",ATTRIBS(A_PRIV,A_NOTE_PRIV),"Transfer GPR to Supervisor Reg", {SdV=RsV;}) +Q6INSN(Y4_tfrscpp,"Rdd32=Sss128",ATTRIBS(A_PRIV,A_NOTE_PRIV),"Transfer Supervisor Reg to GPR", {RddV=SssV;}) +Q6INSN(Y4_tfrspcp,"Sdd128=Rss32",ATTRIBS(A_PRIV,A_NOTE_PRIV),"Transfer GPR to Supervisor Reg", {SddV=RssV;}) + +#ifdef PTWALK +Q6INSN(Y6_tfracrr,"Rd32=As64",ATTRIBS(A_PRIV,A_NOTE_PRIV),"Transfer Address Reg to GPR", {RdV=AsV;}) +Q6INSN(Y6_tfrarcr,"Ad64=Rs32",ATTRIBS(A_PRIV,A_NOTE_PRIV),"Transfer GPR to Address Reg", {AdV=RsV;}) +Q6INSN(Y6_tfracpp,"Rdd32=Ass64",ATTRIBS(A_PRIV,A_NOTE_PRIV),"Transfer Address Reg to GPR", {RddV=AssV;}) +Q6INSN(Y6_tfrapcp,"Add64=Rss32",ATTRIBS(A_PRIV,A_NOTE_PRIV),"Transfer GPR to Address Reg", {AddV=RssV;}) +#endif + +Q6INSN(G4_tfrgcrr,"Rd32=Gs32",ATTRIBS(A_GUEST,A_NOTE_GUEST),"Transfer Guest Reg to GPR", {RdV=GsV;}) +Q6INSN(G4_tfrgrcr,"Gd32=Rs32",ATTRIBS(A_GUEST,A_NOTE_GUEST),"Transfer GPR to Guest Reg", {GdV=RsV;}) +Q6INSN(G4_tfrgcpp,"Rdd32=Gss32",ATTRIBS(A_GUEST,A_NOTE_GUEST),"Transfer Guest Reg to GPR", {RddV=GssV;}) +Q6INSN(G4_tfrgpcp,"Gdd32=Rss32",ATTRIBS(A_GUEST,A_NOTE_GUEST),"Transfer GPR to Guest Reg", {GddV=RssV;}) + + + +Q6INSN(Y2_setprio,"setprio(Pt4,Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV),"Change TID Prio of another thread", +{fPREDUSE_TIMING();WRITE_PRIO(PtV & thread->processor_ptr->thread_system_mask,RsV); }) + + + + +/********************************************/ +/* Power Management / Thread on/off */ +/********************************************/ +Q6INSN(Y6_diag,"diag(Rs32)",ATTRIBS(),"Send value to Diag trace module",{ +}) +Q6INSN(Y6_diag0,"diag0(Rss32,Rtt32)",ATTRIBS(),"Send values of two register to DIAG Trace. Set X=0",{ +}) +Q6INSN(Y6_diag1,"diag1(Rss32,Rtt32)",ATTRIBS(),"Send values of two register to DIAG Trace. Set X=1",{ +}) + + +Q6INSN(Y4_trace,"trace(Rs32)",ATTRIBS(A_NOTE_AXOK,A_RESTRICT_PACKET_AXOK),"Send value to ETM trace",{ + fDO_TRACE(RsV); +}) + +Q6INSN(Y2_stop,"stop(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET),"Stop thread(s)",{ + fHIDE(RsV=RsV;) + if (!fIN_DEBUG_MODE_NO_ISDB(fGET_TNUM())) fCLEAR_RUN_MODE(fGET_TNUM()); +}) + +Q6INSN(Y4_nmi,"nmi(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET,A_NO_TIMING_LOG),"Raise NMI on thread(s)",{ + fDO_NMI(RsV); +}) + +Q6INSN(Y2_start,"start(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET),"Start thread(s)",fSTART(RsV);) + +Q6INSN(Y2_wait,"wait(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET,A_NO_TIMING_LOG),"Make thread(s) wait",{ + fHIDE(RsV=RsV;) + if (!fIN_DEBUG_MODE(fGET_TNUM())) fSET_WAIT_MODE(fGET_TNUM()); + fIN_DEBUG_MODE_WARN(fGET_TNUM()); +}) + +Q6INSN(Y2_resume,"resume(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET),"Make thread(s) stop waiting",fRESUME(RsV);) + +Q6INSN(Y2_break,"brkpt",ATTRIBS(A_NOTE_NOPACKET,A_RESTRICT_NOPACKET),"Breakpoint",{fBREAK();}) + + +/********************************************/ +/* Cache Management */ +/********************************************/ + +Q6INSN(Y2_ictagr,"Rd32=ictagr(Rs32)",ATTRIBS(A_ICOP,A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET,A_CACHEOP,A_COPBYIDX,A_ICTAGOP),"Instruction Cache Tag Read",{fICTAGR(RsV,RdV,RdN);}) +Q6INSN(Y2_ictagw,"ictagw(Rs32,Rt32)",ATTRIBS(A_ICOP,A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET,A_CACHEOP,A_COPBYIDX,A_ICTAGOP),"Instruction Cache Tag Write",{fICTAGW(RsV,RtV);}) +Q6INSN(Y2_icdataw,"icdataw(Rs32,Rt32)",ATTRIBS(A_ICOP,A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET,A_CACHEOP,A_COPBYIDX,A_ICTAGOP),"Instruction Cache Data Write",{fICDATAW(RsV,RtV);}) +Q6INSN(Y2_icdatar,"Rd32=icdatar(Rs32)",ATTRIBS(A_ICOP,A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET,A_CACHEOP,A_COPBYIDX,A_ICTAGOP),"Instruction Cache Data Read",{fICDATAR(RsV, RdV);}) +Q6INSN(Y2_icinva,"icinva(Rs32)",ATTRIBS(A_ICOP,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET,A_CACHEOP,A_COPBYADDRESS,A_ICFLUSHOP),"Instruction Cache Invalidate Address",{fEA_REG(RsV); fICINVA(EA);}) +Q6INSN(Y2_icinvidx,"icinvidx(Rs32)",ATTRIBS(A_ICOP,A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET,A_CACHEOP,A_COPBYIDX,A_ICFLUSHOP),"Instruction Cache Invalidate Index",{fICINVIDX(RsV);}) +Q6INSN(Y2_ickill,"ickill",ATTRIBS(A_ICOP,A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET,A_CACHEOP,A_ICFLUSHOP),"Instruction Cache Invalidate",{fICKILL();}) + +Q6INSN(Y2_isync,"isync",ATTRIBS(A_NOTE_NOPACKET,A_RESTRICT_NOPACKET),"Memory Synchronization",{fISYNC();}) +Q6INSN(Y2_barrier,"barrier",ATTRIBS(A_NOTE_NOPACKET,A_RESTRICT_SLOT0ONLY,A_RESTRICT_PACKET_AXOK),"Memory Barrier",{fBARRIER();}) +Q6INSN(Y2_syncht,"syncht",ATTRIBS(A_NOTE_NOPACKET,A_RESTRICT_SLOT0ONLY,A_RESTRICT_NOPACKET),"Memory Synchronization",{fSYNCH();}) + + +Q6INSN(Y2_dcfetchbo,"dcfetch(Rs32+#u11:3)",ATTRIBS(A_RESTRICT_PREFERSLOT0,A_DCFETCH,A_RESTRICT_NOSLOT1_STORE),"Data Cache Prefetch",{fEA_RI(RsV,uiV); fDCFETCH(EA);}) +DEF_MAPPING(Y2_dcfetch, "dcfetch(Rs32)","dcfetch(Rs32+#0)") +Q6INSN(Y2_dckill,"dckill",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_SLOT0ONLY,A_RESTRICT_NOPACKET,A_CACHEOP,A_DCFLUSHOP),"Data Cache Invalidate",{fDCKILL();}) + + +Q6INSN(Y2_dczeroa,"dczeroa(Rs32)",ATTRIBS(A_STORE,A_RESTRICT_SLOT1_AOK,A_NOTE_SLOT1_AOK,A_RESTRICT_SLOT0ONLY,A_CACHEOP,A_COPBYADDRESS,A_DCZEROA),"Zero an aligned 32-byte cacheline",{fEA_REG(RsV); fDCZEROA(EA);}) +Q6INSN(Y2_dccleana,"dccleana(Rs32)",ATTRIBS(A_RESTRICT_SLOT1_AOK,A_NOTE_SLOT1_AOK,A_RESTRICT_SLOT0ONLY,A_CACHEOP,A_COPBYADDRESS,A_DCFLUSHOP),"Data Cache Clean Address",{fEA_REG(RsV); fDCCLEANA(EA);}) +Q6INSN(Y2_dccleanidx,"dccleanidx(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_RESTRICT_PACKET_AXOK,A_NOTE_AXOK,A_RESTRICT_SLOT0ONLY,A_CACHEOP,A_COPBYIDX,A_DCFLUSHOP),"Data Cache Clean Index",{fDCCLEANIDX(RsV);}) +Q6INSN(Y2_dccleaninva,"dccleaninva(Rs32)",ATTRIBS(A_RESTRICT_SLOT1_AOK,A_NOTE_SLOT1_AOK,A_RESTRICT_SLOT0ONLY,A_CACHEOP,A_COPBYADDRESS,A_DCFLUSHOP),"Data Cache Clean and Invalidate Address",{fEA_REG(RsV); fDCCLEANINVA(EA);}) +Q6INSN(Y2_dccleaninvidx,"dccleaninvidx(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_RESTRICT_PACKET_AXOK,A_NOTE_AXOK,A_RESTRICT_SLOT0ONLY,A_CACHEOP,A_COPBYIDX,A_DCFLUSHOP),"Data Cache Clean and Invalidate Index",{fDCCLEANINVIDX(RsV);}) +Q6INSN(Y2_dcinva,"dcinva(Rs32)",ATTRIBS(A_RESTRICT_SLOT1_AOK,A_NOTE_SLOT1_AOK,A_RESTRICT_SLOT0ONLY,A_CACHEOP,A_COPBYADDRESS,A_DCFLUSHOP),"Data Cache Invalidate Address",{fEA_REG(RsV); fDCCLEANINVA(EA);}) +Q6INSN(Y2_dcinvidx,"dcinvidx(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_RESTRICT_PACKET_AXOK,A_NOTE_AXOK,A_RESTRICT_SLOT0ONLY,A_CACHEOP,A_COPBYIDX,A_DCFLUSHOP),"Data Cache Invalidate Index",{fDCINVIDX(RsV);}) +Q6INSN(Y2_dctagr,"Rd32=dctagr(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_RESTRICT_PACKET_AXOK,A_NOTE_AXOK,A_RESTRICT_SLOT0ONLY,A_CACHEOP,A_COPBYIDX,A_DCTAGOP),"Data Cache Tag Read",{fDCTAGR(RsV,RdV,RdN);}) +Q6INSN(Y2_dctagw,"dctagw(Rs32,Rt32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_RESTRICT_SLOT0ONLY,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET,A_CACHEOP,A_COPBYIDX,A_DCTAGOP),"Data Cache Tag Write",{fDCTAGW(RsV,RtV);}) + + +Q6INSN(Y2_l2kill,"l2kill",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_SLOT0ONLY,A_RESTRICT_NOPACKET,A_CACHEOP,A_L2FLUSHOP),"L2 Cache Invalidate",{fL2KILL();}) +Q6INSN(Y4_l2tagw,"l2tagw(Rs32,Rt32)",ATTRIBS(A_PRIV,A_NOTE_BADTAG_UNDEF,A_NOTE_PRIV,A_RESTRICT_SLOT0ONLY,A_NOTE_NOPACKET,A_RESTRICT_NOPACKET,A_CACHEOP,A_COPBYIDX,A_L2TAGOP),"L2 Cache Tag Write",{fL2TAGW(RsV,RtV);}) +Q6INSN(Y4_l2tagr,"Rd32=l2tagr(Rs32)",ATTRIBS(A_PRIV,A_NOTE_BADTAG_UNDEF,A_NOTE_PRIV,A_NOTE_AXOK,A_RESTRICT_PACKET_AXOK,A_RESTRICT_SLOT0ONLY,A_CACHEOP,A_COPBYIDX,A_L2TAGOP),"L2 Cache Tag Read",{fL2TAGR(RsV,RdV,RdN);}) + +Q6INSN(Y2_l2cleaninvidx,"l2cleaninvidx(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_AXOK,A_RESTRICT_PACKET_AXOK,A_RESTRICT_SLOT0ONLY,A_CACHEOP,A_COPBYIDX,A_L2FLUSHOP),"L2 Cache Clean and Invalidate Index",{fL2CLEANINVIDX(RsV); }) +Q6INSN(Y5_l2cleanidx,"l2cleanidx(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_AXOK,A_RESTRICT_PACKET_AXOK,A_RESTRICT_SLOT0ONLY,A_CACHEOP,A_COPBYIDX,A_L2FLUSHOP),"L2 Cache Clean by Index",{fL2CLEANIDX(RsV); }) +Q6INSN(Y5_l2invidx,"l2invidx(Rs32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_AXOK,A_RESTRICT_PACKET_AXOK,A_RESTRICT_SLOT0ONLY,A_CACHEOP,A_COPBYIDX,A_L2FLUSHOP),"L2 Cache Invalidate by Index",{fL2INVIDX(RsV); }) + + + +Q6INSN(Y4_l2fetch,"l2fetch(Rs32,Rt32)",ATTRIBS(A_RESTRICT_SLOT0ONLY,A_RESTRICT_PACKET_AXOK,A_NOTE_AXOK),"L2 Cache Prefetch", +{ fL2FETCH(RsV, + (RtV&0xff), /*height*/ + ((RtV>>8)&0xff), /*width*/ + ((RtV>>16)&0xffff), /*stride*/ + 0); /*extra attrib flags*/ +}) + + + +Q6INSN(Y5_l2fetch,"l2fetch(Rs32,Rtt32)",ATTRIBS(A_RESTRICT_SLOT0ONLY,A_RESTRICT_PACKET_AXOK,A_NOTE_AXOK),"L2 Cache Prefetch", +{ fL2FETCH(RsV, + fGETUHALF(0,RttV), /*height*/ + fGETUHALF(1,RttV), /*width*/ + fGETUHALF(2,RttV), /*stride*/ + fGETUHALF(3,RttV)); /*flags*/ +}) + +Q6INSN(Y5_l2locka,"Pd4=l2locka(Rs32)", ATTRIBS(A_PRIV,A_NOTE_PRIV,A_CACHEOP,A_COPBYADDRESS,A_RESTRICT_SLOT0ONLY,A_RESTRICT_PACKET_AXOK,A_NOTE_AXOK,A_RESTRICT_LATEPRED,A_NOTE_LATEPRED), +"Lock L2 cache line by address", { fEA_REG(RsV); fL2LOCKA(EA,PdV,PdN); fHIDE(MARK_LATE_PRED_WRITE(PdN)) }); + + +Q6INSN(Y5_l2unlocka,"l2unlocka(Rs32)", ATTRIBS(A_PRIV,A_NOTE_PRIV,A_CACHEOP,A_COPBYADDRESS,A_RESTRICT_SLOT0ONLY,A_RESTRICT_PACKET_AXOK,A_NOTE_AXOK), "UnLock L2 cache line by address", { fEA_REG(RsV); fL2UNLOCKA(EA); }) + + + +Q6INSN(Y5_l2gunlock,"l2gunlock",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_SLOT0ONLY,A_RESTRICT_NOPACKET,A_CACHEOP,A_L2FLUSHOP),"L2 Global Unlock",{fL2UNLOCK();}) + +Q6INSN(Y5_l2gclean,"l2gclean",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_SLOT0ONLY,A_RESTRICT_NOPACKET,A_CACHEOP,A_L2FLUSHOP),"L2 Global Clean",{fL2CLEAN();}) + +Q6INSN(Y5_l2gcleaninv,"l2gcleaninv",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_SLOT0ONLY,A_RESTRICT_NOPACKET,A_CACHEOP,A_L2FLUSHOP),"L2 Global Clean and Invalidate",{fL2CLEANINV();}) + +Q6INSN(Y6_l2gcleanpa,"l2gclean(Rtt32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_SLOT0ONLY,A_RESTRICT_NOPACKET,A_CACHEOP,A_L2FLUSHOP),"L2 Global Clean by PA Range",{fL2CLEANPA(RttV);}) + +Q6INSN(Y6_l2gcleaninvpa,"l2gcleaninv(Rtt32)",ATTRIBS(A_PRIV,A_NOTE_PRIV,A_NOTE_NOPACKET,A_RESTRICT_SLOT0ONLY,A_RESTRICT_NOPACKET,A_CACHEOP,A_L2FLUSHOP),"L2 Global Clean and Invalidate by PA Range",{fL2CLEANINVPA(RttV);}) + +