From patchwork Wed Feb 15 06:37:16 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bharata B Rao
X-Patchwork-Id: 9573419
From: Bharata B Rao
To: qemu-devel@nongnu.org
Date: Wed, 15 Feb 2017 12:07:16 +0530
X-Mailer: git-send-email 2.7.4
Message-Id: <1487140636-19955-1-git-send-email-bharata@linux.vnet.ibm.com>
Subject: [Qemu-devel] [PATCH] target-ppc: Add quad precision muladd instructions
Cc: rth@twiddle.net, qemu-ppc@nongnu.org, Bharata B Rao, nikunj@linux.vnet.ibm.com, david@gibson.dropbear.id.au

xsmaddqp:   VSX Scalar Multiply-Add Quad-Precision
xsmaddqpo:  VSX Scalar Multiply-Add Quad-Precision using round to Odd
xsnmaddqp:  VSX Scalar Negative Multiply-Add Quad-Precision
xsnmaddqpo: VSX Scalar Negative Multiply-Add Quad-Precision using round to Odd
xsmsubqp:   VSX Scalar Multiply-Subtract Quad-Precision
xsmsubqpo:  VSX Scalar Multiply-Subtract Quad-Precision using round to Odd
xsnmsubqp:  VSX Scalar Negative Multiply-Subtract Quad-Precision
xsnmsubqpo: VSX Scalar Negative Multiply-Subtract Quad-Precision using round to Odd

Signed-off-by: Bharata B Rao
---
 target/ppc/fpu_helper.c             | 69 +++++++++++++++++++++++++++++++++++++
 target/ppc/helper.h                 |  4 +++
 target/ppc/translate/vsx-impl.inc.c |  4 +++
 target/ppc/translate/vsx-ops.inc.c  |  4 +++
 4 files changed, 81 insertions(+)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 58aee64..201cafd 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2425,6 +2425,75 @@ VSX_MADD(xvnmaddmsp, 4, float32, VsrW(i), NMADD_FLGS, 0, 0, 0)
 VSX_MADD(xvnmsubasp, 4, float32, VsrW(i), NMSUB_FLGS, 1, 0, 0)
 VSX_MADD(xvnmsubmsp, 4, float32, VsrW(i), NMSUB_FLGS, 0, 0, 0)
 
+/*
+ * Quadruple-precision version of multiply and add/subtract.
+ *
+ * This implementation is not 100% accurate as we truncate the
+ * intermediate result of multiplication and then add/subtract
+ * separately.
+ *
+ * TODO: When float128_muladd() becomes available, switch this
+ * implementation to use that instead of separate float128_mul()
+ * followed by float128_add().
+ */
+#define VSX_MADD_QP(op, maddflgs)                                             \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+{                                                                             \
+    ppc_vsr_t xt_in, xa, xb, xt_out;                                          \
+                                                                              \
+    getVSR(rA(opcode) + 32, &xa, env);                                        \
+    getVSR(rB(opcode) + 32, &xb, env);                                        \
+    getVSR(rD(opcode) + 32, &xt_in, env);                                     \
+                                                                              \
+    xt_out = xt_in;                                                           \
+    helper_reset_fpstatus(env);                                               \
+    float_status tstat = env->fp_status;                                      \
+    if (unlikely(Rc(opcode) != 0)) {                                          \
+        tstat.float_rounding_mode = float_round_to_odd;                       \
+    }                                                                         \
+    set_float_exception_flags(0, &tstat);                                     \
+    xt_out.f128 = float128_mul(xa.f128, xt_in.f128, &tstat);                  \
+                                                                              \
+    if (maddflgs & float_muladd_negate_c) {                                   \
+        xb.VsrD(0) ^= 0x8000000000000000;                                     \
+    }                                                                         \
+    xt_out.f128 = float128_add(xt_out.f128, xb.f128, &tstat);                 \
+    env->fp_status.float_exception_flags |= tstat.float_exception_flags;      \
+                                                                              \
+    if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {         \
+        if (float128_is_signaling_nan(xa.f128, &tstat) ||                     \
+            float128_is_signaling_nan(xt_in.f128, &tstat) ||                  \
+            float128_is_signaling_nan(xb.f128, &tstat)) {                     \
+            float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);            \
+            tstat.float_exception_flags &= ~float_flag_invalid;               \
+        }                                                                     \
+        if ((float128_is_infinity(xa.f128) && float128_is_zero(xt_in.f128)) ||\
+            (float128_is_zero(xa.f128) && float128_is_infinity(xt_in.f128))) {\
+            float_invalid_op_excp(env, POWERPC_EXCP_FP_VXIMZ, 1);             \
+            tstat.float_exception_flags &= ~float_flag_invalid;               \
+        }                                                                     \
+        if ((tstat.float_exception_flags & float_flag_invalid) &&             \
+            ((float128_is_infinity(xa.f128) ||                                \
+              float128_is_infinity(xt_in.f128)) &&                            \
+             float128_is_infinity(xb.f128))) {                                \
+            float_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, 1);             \
+        }                                                                     \
+    }                                                                         \
+                                                                              \
+    helper_compute_fprf_float128(env, xt_out.f128);                           \
+    if ((maddflgs & float_muladd_negate_result) &&                            \
+        !float128_is_any_nan(xt_out.f128)) {                                  \
+        xt_out.VsrD(0) ^= 0x8000000000000000;                                 \
+    }                                                                         \
+    putVSR(rD(opcode) + 32, &xt_out, env);                                    \
+    float_check_status(env);                                                  \
+}
+
+VSX_MADD_QP(xsmaddqp, MADD_FLGS)
+VSX_MADD_QP(xsmsubqp, MSUB_FLGS)
+VSX_MADD_QP(xsnmaddqp, NMADD_FLGS)
+VSX_MADD_QP(xsnmsubqp, NMSUB_FLGS)
+
 /* VSX_SCALAR_CMP_DP - VSX scalar floating point compare double precision
  *   op    - instruction mnemonic
  *   cmp   - comparison operation
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 6d77661..eade946 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -480,12 +480,16 @@ DEF_HELPER_2(xssqrtsp, void, env, i32)
 DEF_HELPER_2(xsrsqrtesp, void, env, i32)
 DEF_HELPER_2(xsmaddasp, void, env, i32)
 DEF_HELPER_2(xsmaddmsp, void, env, i32)
+DEF_HELPER_2(xsmaddqp, void, env, i32)
 DEF_HELPER_2(xsmsubasp, void, env, i32)
 DEF_HELPER_2(xsmsubmsp, void, env, i32)
+DEF_HELPER_2(xsmsubqp, void, env, i32)
 DEF_HELPER_2(xsnmaddasp, void, env, i32)
 DEF_HELPER_2(xsnmaddmsp, void, env, i32)
+DEF_HELPER_2(xsnmaddqp, void, env, i32)
 DEF_HELPER_2(xsnmsubasp, void, env, i32)
 DEF_HELPER_2(xsnmsubmsp, void, env, i32)
+DEF_HELPER_2(xsnmsubqp, void, env, i32)
 
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 7f12908..0a96e6b 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -853,12 +853,16 @@ GEN_VSX_HELPER_2(xssqrtsp, 0x16, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsrsqrtesp, 0x14, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsmaddasp, 0x04, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsmaddmsp, 0x04, 0x01, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsmaddqp, 0x04, 0x0C, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xsmsubasp, 0x04, 0x02, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsmsubmsp, 0x04, 0x03, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsmsubqp, 0x04, 0x0D, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xsnmaddasp, 0x04, 0x10, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsnmaddmsp, 0x04, 0x11, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsnmaddqp, 0x04, 0x0E, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xsnmsubasp, 0x04, 0x12, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsnmsubmsp, 0x04, 0x13, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsnmsubqp, 0x04, 0x0F, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscvsxdsp, 0x10, 0x13, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xscvuxdsp, 0x10, 0x12, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xststdcsp, 0x14, 0x12, 0, PPC2_ISA300)
diff --git a/target/ppc/translate/vsx-ops.inc.c b/target/ppc/translate/vsx-ops.inc.c
index 5030c4a..e770fab 100644
--- a/target/ppc/translate/vsx-ops.inc.c
+++ b/target/ppc/translate/vsx-ops.inc.c
@@ -237,12 +237,16 @@ GEN_XX2FORM(xssqrtsp, 0x16, 0x00, PPC2_VSX207),
 GEN_XX2FORM(xsrsqrtesp, 0x14, 0x00, PPC2_VSX207),
 GEN_XX3FORM(xsmaddasp, 0x04, 0x00, PPC2_VSX207),
 GEN_XX3FORM(xsmaddmsp, 0x04, 0x01, PPC2_VSX207),
+GEN_VSX_XFORM_300(xsmaddqp, 0x04, 0x0C, 0x0),
 GEN_XX3FORM(xsmsubasp, 0x04, 0x02, PPC2_VSX207),
 GEN_XX3FORM(xsmsubmsp, 0x04, 0x03, PPC2_VSX207),
+GEN_VSX_XFORM_300(xsmsubqp, 0x04, 0x0D, 0x0),
 GEN_XX3FORM(xsnmaddasp, 0x04, 0x10, PPC2_VSX207),
 GEN_XX3FORM(xsnmaddmsp, 0x04, 0x11, PPC2_VSX207),
+GEN_VSX_XFORM_300(xsnmaddqp, 0x04, 0x0E, 0x0),
 GEN_XX3FORM(xsnmsubasp, 0x04, 0x12, PPC2_VSX207),
 GEN_XX3FORM(xsnmsubmsp, 0x04, 0x13, PPC2_VSX207),
+GEN_VSX_XFORM_300(xsnmsubqp, 0x04, 0x0F, 0x0),
 GEN_XX2FORM(xscvsxdsp, 0x10, 0x13, PPC2_VSX207),
 GEN_XX2FORM(xscvuxdsp, 0x10, 0x12, PPC2_VSX207),