From patchwork Thu Jan 5 22:13:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Simpson X-Patchwork-Id: 13090595 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0350CC54EBC for ; Thu, 5 Jan 2023 22:15:39 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pDYUY-0006xk-UI; Thu, 05 Jan 2023 17:13:42 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUV-0006v3-VF for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:41 -0500 Received: from mx0b-0031df01.pphosted.com ([205.220.180.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUU-000283-4o for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:39 -0500 Received: from pps.filterd (m0279873.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 305M2knl002696; Thu, 5 Jan 2023 22:13:36 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=qcppdkim1; bh=QEPk+a1hU+rjuHR5CcRpNKnD8Ev5S/odWrnpSh7TKNU=; b=jj6A6+lQVyo4+sLup0QvpLoFiMcIVucNMqsyq/4Tt7Bq/YnnIGNoKchIbgXxaI9gjbPm 5h3kCVYNAaOeeUbTn1ALQcbvoUXfBfdHzrU+eD3QMiHW4Q6EeuH6jKn0N0LBRjjgiQW1 MpEad5TAZpueSJjteLf0m2tN1uWTkOP7vqLRoof+8UsQlCTeRDEUdjdxOLpUPXCdX4Cj 7JTJNxNwZl3YbsBqd3uuuz9acAoa+6seuZudIPlPhgBBnXMqZqCDus/s5p69zj/Pq/cg qY9+NBunNt1/aghK8DlPcxoOq6Nyd7oJdXRFceTJ88Rz5vARzwdyp2dkiySyvkMJM4Ts UA== Received: from nalasppmta05.qualcomm.com (Global_NAT1.qualcomm.com [129.46.96.20]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3mx3s1rcs2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 05 Jan 2023 22:13:36 +0000 Received: from pps.filterd (NALASPPMTA05.qualcomm.com [127.0.0.1]) by NALASPPMTA05.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTP id 305MDZuV031217; Thu, 5 Jan 2023 22:13:35 GMT Received: from pps.reinject (localhost [127.0.0.1]) by NALASPPMTA05.qualcomm.com (PPS) with ESMTP id 3mx2kp91ag-1; Thu, 05 Jan 2023 22:13:34 +0000 Received: from NALASPPMTA05.qualcomm.com (NALASPPMTA05.qualcomm.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 305MDYiF031200; Thu, 5 Jan 2023 22:13:34 GMT Received: from hu-devc-lv-u18-c.qualcomm.com (hu-tsimpson-lv.qualcomm.com [10.47.235.220]) by NALASPPMTA05.qualcomm.com (PPS) with ESMTP id 305MDYHZ031197; Thu, 05 Jan 2023 22:13:34 +0000 Received: by hu-devc-lv-u18-c.qualcomm.com (Postfix, from userid 47164) id 34EB85000A8; Thu, 5 Jan 2023 14:13:34 -0800 (PST) From: Taylor Simpson To: qemu-devel@nongnu.org Cc: tsimpson@quicinc.com, richard.henderson@linaro.org, philmd@linaro.org, ale@rev.ng, anjo@rev.ng, bcain@quicinc.com, quic_mathbern@quicinc.com Subject: [PATCH v3 1/9] Hexagon (target/hexagon) Add overrides for jumpr31 instructions Date: Thu, 5 Jan 2023 14:13:23 -0800 Message-Id: <20230105221331.12069-2-tsimpson@quicinc.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230105221331.12069-1-tsimpson@quicinc.com> References: <20230105221331.12069-1-tsimpson@quicinc.com> MIME-Version: 1.0 X-QCInternal: smtphost X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-ORIG-GUID: YCCiwZl1FAd-uCXVvVAkQREAR2-LWBNg X-Proofpoint-GUID: YCCiwZl1FAd-uCXVvVAkQREAR2-LWBNg X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2023-01-05_12,2023-01-05_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 priorityscore=1501 impostorscore=0 lowpriorityscore=0 mlxscore=0 bulkscore=0 adultscore=0 phishscore=0 malwarescore=0 clxscore=1015 mlxlogscore=629 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301050174 Received-SPF: pass client-ip=205.220.180.131; envelope-from=tsimpson@qualcomm.com; helo=mx0b-0031df01.pphosted.com X-Spam_score_int: -17 X-Spam_score: -1.8 X-Spam_bar: - X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.25, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Add overrides for SL2_jumpr31 Unconditional SL2_jumpr31_t Predicated true (old value) SL2_jumpr31_f Predicated false (old value) SL2_jumpr31_tnew Predicated true (new value) SL2_jumpr31_fnew Predicated false (new value) Signed-off-by: Taylor Simpson --- target/hexagon/gen_tcg.h | 15 ++++++++++++++- target/hexagon/genptr.c | 10 +++++++++- 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h index 19697b42a5..d644e59a63 100644 --- a/target/hexagon/gen_tcg.h +++ b/target/hexagon/gen_tcg.h @@ -1,5 +1,5 @@ /* - * Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved. + * Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -1015,6 +1015,19 @@ #define fGEN_TCG_S2_asl_r_r_sat(SHORTCODE) \ gen_asl_r_r_sat(RdV, RsV, RtV) +#define fGEN_TCG_SL2_jumpr31(SHORTCODE) \ + gen_jumpr(ctx, hex_gpr[HEX_REG_LR]) + +#define fGEN_TCG_SL2_jumpr31_t(SHORTCODE) \ + gen_cond_jumpr31(ctx, TCG_COND_EQ, hex_pred[0]) +#define fGEN_TCG_SL2_jumpr31_f(SHORTCODE) \ + gen_cond_jumpr31(ctx, TCG_COND_NE, hex_pred[0]) + +#define fGEN_TCG_SL2_jumpr31_tnew(SHORTCODE) \ + gen_cond_jumpr31(ctx, TCG_COND_EQ, hex_new_pred_value[0]) +#define fGEN_TCG_SL2_jumpr31_fnew(SHORTCODE) \ + gen_cond_jumpr31(ctx, TCG_COND_NE, hex_new_pred_value[0]) + /* Floating point */ #define fGEN_TCG_F2_conv_sf2df(SHORTCODE) \ gen_helper_conv_sf2df(RddV, cpu_env, RsV) diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c index 6cf2e0ed43..a8997250d3 100644 --- a/target/hexagon/genptr.c +++ b/target/hexagon/genptr.c @@ -1,5 +1,5 @@ /* - * Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved. + * Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -553,6 +553,14 @@ static void gen_cond_jumpr(DisasContext *ctx, TCGv dst_pc, gen_write_new_pc_addr(ctx, dst_pc, cond, pred); } +static void gen_cond_jumpr31(DisasContext *ctx, TCGCond cond, TCGv pred) +{ + TCGv LSB = tcg_temp_new(); + tcg_gen_andi_tl(LSB, pred, 1); + gen_cond_jumpr(ctx, hex_gpr[HEX_REG_LR], cond, LSB); + tcg_temp_free(LSB); +} + static void gen_cond_jump(DisasContext *ctx, TCGCond cond, TCGv pred, int pc_off) { From patchwork Thu Jan 5 22:13:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Simpson X-Patchwork-Id: 13090605 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 39664C4708E for ; Thu, 5 Jan 2023 22:22:44 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pDYUb-00070w-Hf; Thu, 05 Jan 2023 17:13:45 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUY-0006xp-Uj for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:42 -0500 Received: from mx0a-0031df01.pphosted.com ([205.220.168.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUU-000287-Jb for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:42 -0500 Received: from pps.filterd (m0279862.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 305KM4gg007125; Thu, 5 Jan 2023 22:13:35 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=qcppdkim1; bh=e/azP095Z8ooGOenjsRa0f38WjV1F9p/J2NkClPTqqQ=; b=cIxnfPW6Nu2xHH0xj8dAKOe/gSt9i6KY18ishSY0vF7ixoPq5karSAusFAw1yvvt3Vn8 AX6xn9GG3HF9YM7o2zbv0W57uKpUcPj/g5waSd7OpB1BKvD00ID8djLTO/2SpNbH2KbC TXHHy7tyqwzcjRLIo2sQ7vpGuSN9gBzr4NA0v0A1ZdXuNdLnyegxOi3iQzYTPNec7e54 QNLQHk0hk8xNCI3sqreICNd+vhl/Kbd7o2MlzcmnuLHbFZoCsvGY2iv5mZstVjrJJTQB abBxKKBFD9QD6owR0eNxbgEe+yWG/Xl0oAOB5EQ/Nt76g6KWaeE5MPWX1vrAPSs1v4qc aQ== Received: from nalasppmta04.qualcomm.com (Global_NAT1.qualcomm.com [129.46.96.20]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3mwj1rthv9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 05 Jan 2023 22:13:35 +0000 Received: from pps.filterd (NALASPPMTA04.qualcomm.com [127.0.0.1]) by NALASPPMTA04.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTP id 305MDYZk027146; Thu, 5 Jan 2023 22:13:34 GMT Received: from pps.reinject (localhost [127.0.0.1]) by NALASPPMTA04.qualcomm.com (PPS) with ESMTP id 3mx6kmg393-1; Thu, 05 Jan 2023 22:13:34 +0000 Received: from NALASPPMTA04.qualcomm.com (NALASPPMTA04.qualcomm.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 305MDYkQ027139; Thu, 5 Jan 2023 22:13:34 GMT Received: from hu-devc-lv-u18-c.qualcomm.com (hu-tsimpson-lv.qualcomm.com [10.47.235.220]) by NALASPPMTA04.qualcomm.com (PPS) with ESMTP id 305MDYML027138; Thu, 05 Jan 2023 22:13:34 +0000 Received: by hu-devc-lv-u18-c.qualcomm.com (Postfix, from userid 47164) id 37BA55000AE; Thu, 5 Jan 2023 14:13:34 -0800 (PST) From: Taylor Simpson To: qemu-devel@nongnu.org Cc: tsimpson@quicinc.com, richard.henderson@linaro.org, philmd@linaro.org, ale@rev.ng, anjo@rev.ng, bcain@quicinc.com, quic_mathbern@quicinc.com Subject: [PATCH v3 2/9] Hexagon (target/hexagon) Add overrides for callr Date: Thu, 5 Jan 2023 14:13:24 -0800 Message-Id: <20230105221331.12069-3-tsimpson@quicinc.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230105221331.12069-1-tsimpson@quicinc.com> References: <20230105221331.12069-1-tsimpson@quicinc.com> MIME-Version: 1.0 X-QCInternal: smtphost X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-ORIG-GUID: CNTZhDntmj-Hs1dEiMsjEspxULO5YRRi X-Proofpoint-GUID: CNTZhDntmj-Hs1dEiMsjEspxULO5YRRi X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2023-01-05_12,2023-01-05_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 phishscore=0 lowpriorityscore=0 spamscore=0 mlxlogscore=630 malwarescore=0 suspectscore=0 mlxscore=0 priorityscore=1501 clxscore=1015 adultscore=0 bulkscore=0 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301050174 Received-SPF: pass client-ip=205.220.168.131; envelope-from=tsimpson@qualcomm.com; helo=mx0a-0031df01.pphosted.com X-Spam_score_int: -24 X-Spam_score: -2.5 X-Spam_bar: -- X-Spam_report: (-2.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.25, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Add overrides for J2_callr J2_callrt J2_callrf Signed-off-by: Taylor Simpson --- target/hexagon/gen_tcg.h | 6 ++++++ target/hexagon/macros.h | 12 +----------- target/hexagon/genptr.c | 20 ++++++++++++++++++++ 3 files changed, 27 insertions(+), 11 deletions(-) diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h index d644e59a63..9e8f3373ad 100644 --- a/target/hexagon/gen_tcg.h +++ b/target/hexagon/gen_tcg.h @@ -614,11 +614,17 @@ #define fGEN_TCG_J2_call(SHORTCODE) \ gen_call(ctx, riV) +#define fGEN_TCG_J2_callr(SHORTCODE) \ + gen_callr(ctx, RsV) #define fGEN_TCG_J2_callt(SHORTCODE) \ gen_cond_call(ctx, PuV, TCG_COND_EQ, riV) #define fGEN_TCG_J2_callf(SHORTCODE) \ gen_cond_call(ctx, PuV, TCG_COND_NE, riV) +#define fGEN_TCG_J2_callrt(SHORTCODE) \ + gen_cond_callr(ctx, TCG_COND_EQ, PuV, RsV) +#define fGEN_TCG_J2_callrf(SHORTCODE) \ + gen_cond_callr(ctx, TCG_COND_NE, PuV, RsV) #define fGEN_TCG_J2_endloop0(SHORTCODE) \ gen_endloop0(ctx) diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h index cd64bb8eec..8f1f82f8da 100644 --- a/target/hexagon/macros.h +++ b/target/hexagon/macros.h @@ -1,5 +1,5 @@ /* - * Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved. + * Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -421,16 +421,6 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int shift) #define fBRANCH(LOC, TYPE) fWRITE_NPC(LOC) #define fJUMPR(REGNO, TARGET, TYPE) fBRANCH(TARGET, COF_TYPE_JUMPR) #define fHINTJR(TARGET) { /* Not modelled in qemu */} -#define fCALL(A) \ - do { \ - fWRITE_LR(fREAD_NPC()); \ - fBRANCH(A, COF_TYPE_CALL); \ - } while (0) -#define fCALLR(A) \ - do { \ - fWRITE_LR(fREAD_NPC()); \ - fBRANCH(A, COF_TYPE_CALLR); \ - } while (0) #define fWRITE_LOOP_REGS0(START, COUNT) \ do { \ WRITE_RREG(HEX_REG_LC0, COUNT); \ diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c index a8997250d3..d15df1dd28 100644 --- a/target/hexagon/genptr.c +++ b/target/hexagon/genptr.c @@ -670,6 +670,14 @@ static void gen_call(DisasContext *ctx, int pc_off) gen_write_new_pc_pcrel(ctx, pc_off, TCG_COND_ALWAYS, NULL); } +static void gen_callr(DisasContext *ctx, TCGv new_pc) +{ + TCGv next_PC = + tcg_constant_tl(ctx->pkt->pc + ctx->pkt->encod_pkt_size_in_bytes); + gen_log_reg_write(HEX_REG_LR, next_PC); + gen_write_new_pc_addr(ctx, new_pc, TCG_COND_ALWAYS, NULL); +} + static void gen_cond_call(DisasContext *ctx, TCGv pred, TCGCond cond, int pc_off) { @@ -686,6 +694,18 @@ static void gen_cond_call(DisasContext *ctx, TCGv pred, gen_set_label(skip); } +static void gen_cond_callr(DisasContext *ctx, + TCGCond cond, TCGv pred, TCGv new_pc) +{ + TCGv lsb = tcg_temp_new(); + TCGLabel *skip = gen_new_label(); + tcg_gen_andi_tl(lsb, pred, 1); + tcg_gen_brcondi_tl(cond, lsb, 0, skip); + tcg_temp_free(lsb); + gen_callr(ctx, new_pc); + gen_set_label(skip); +} + static void gen_endloop0(DisasContext *ctx) { TCGv lpcfg = tcg_temp_local_new(); From patchwork Thu Jan 5 22:13:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Simpson X-Patchwork-Id: 13090606 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CE0D3C3DA7A for ; Thu, 5 Jan 2023 22:24:37 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pDYUZ-0006yd-TK; Thu, 05 Jan 2023 17:13:43 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUX-0006wb-FG for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:41 -0500 Received: from mx0b-0031df01.pphosted.com ([205.220.180.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUU-000284-Ao for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:41 -0500 Received: from pps.filterd (m0279871.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 305M5T2j002433; Thu, 5 Jan 2023 22:13:36 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=qcppdkim1; bh=+HnjHbWF3m8nUbYCRwHZ9YFy0lRojRf6wFBOBycKutE=; b=TFfz8gbrTlBIdiNvJZefjDDj8BYgoJYGucz7zv1KgrWsfmo5VpAnvI4PYb71r2Ovlm4c jAlTiTlocXP1ld7nJyoyQw51DadeycO5etg8RmkJkWe4KqdxdqSgq1/nUgTiS8YhPJ64 Uk08M5U1+ZItDOmpGRxHZNZt3qmk4RF+hHyKBHhPXVNcGtJzgRBY04kryPI2e4MEbPdy QU9Ps+Bj4g5Tj1s9Cja6FpwfiUjuGqEX1O1Wm9Nao2lEi4PTxcqDlfANjWFJqSPJp00M jKVXddr8jBZs3N5kt2Z9+gVnmO2swTEY986t8wNqwJyiqZSkQSYeQOTCXHmTD5YArirp tA== Received: from nalasppmta05.qualcomm.com (Global_NAT1.qualcomm.com [129.46.96.20]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3mwuuusdrr-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 05 Jan 2023 22:13:36 +0000 Received: from pps.filterd (NALASPPMTA05.qualcomm.com [127.0.0.1]) by NALASPPMTA05.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTP id 305MDZAE031219; Thu, 5 Jan 2023 22:13:35 GMT Received: from pps.reinject (localhost [127.0.0.1]) by NALASPPMTA05.qualcomm.com (PPS) with ESMTP id 3mx2kp91aj-1; Thu, 05 Jan 2023 22:13:35 +0000 Received: from NALASPPMTA05.qualcomm.com (NALASPPMTA05.qualcomm.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 305MDYrs031202; Thu, 5 Jan 2023 22:13:34 GMT Received: from hu-devc-lv-u18-c.qualcomm.com (hu-tsimpson-lv.qualcomm.com [10.47.235.220]) by NALASPPMTA05.qualcomm.com (PPS) with ESMTP id 305MDYQ4031198; Thu, 05 Jan 2023 22:13:34 +0000 Received: by hu-devc-lv-u18-c.qualcomm.com (Postfix, from userid 47164) id 3A8FD5000AF; Thu, 5 Jan 2023 14:13:34 -0800 (PST) From: Taylor Simpson To: qemu-devel@nongnu.org Cc: tsimpson@quicinc.com, richard.henderson@linaro.org, philmd@linaro.org, ale@rev.ng, anjo@rev.ng, bcain@quicinc.com, quic_mathbern@quicinc.com Subject: [PATCH v3 3/9] Hexagon (target/hexagon) Add overrides for endloop1/endloop01 Date: Thu, 5 Jan 2023 14:13:25 -0800 Message-Id: <20230105221331.12069-4-tsimpson@quicinc.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230105221331.12069-1-tsimpson@quicinc.com> References: <20230105221331.12069-1-tsimpson@quicinc.com> MIME-Version: 1.0 X-QCInternal: smtphost X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-GUID: mZdBhqaJBykZlRZq2HYBdTrob3-UfHXC X-Proofpoint-ORIG-GUID: mZdBhqaJBykZlRZq2HYBdTrob3-UfHXC X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2023-01-05_12,2023-01-05_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 impostorscore=0 priorityscore=1501 adultscore=0 clxscore=1015 suspectscore=0 spamscore=0 mlxscore=0 phishscore=0 mlxlogscore=657 malwarescore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301050174 Received-SPF: pass client-ip=205.220.180.131; envelope-from=tsimpson@qualcomm.com; helo=mx0b-0031df01.pphosted.com X-Spam_score_int: -17 X-Spam_score: -1.8 X-Spam_bar: - X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.25, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Taylor Simpson --- target/hexagon/gen_tcg.h | 4 ++ target/hexagon/genptr.c | 79 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 83 insertions(+) diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h index 9e8f3373ad..6267f51ccc 100644 --- a/target/hexagon/gen_tcg.h +++ b/target/hexagon/gen_tcg.h @@ -628,6 +628,10 @@ #define fGEN_TCG_J2_endloop0(SHORTCODE) \ gen_endloop0(ctx) +#define fGEN_TCG_J2_endloop1(SHORTCODE) \ + gen_endloop1(ctx) +#define fGEN_TCG_J2_endloop01(SHORTCODE) \ + gen_endloop01(ctx) /* * Compound compare and jump instructions diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c index d15df1dd28..94054b10e6 100644 --- a/target/hexagon/genptr.c +++ b/target/hexagon/genptr.c @@ -763,6 +763,85 @@ static void gen_endloop0(DisasContext *ctx) tcg_temp_free(lpcfg); } +static void gen_endloop1(DisasContext *ctx) +{ + /* + * if (hex_gpr[HEX_REG_LC1] > 1) { + * PC = hex_gpr[HEX_REG_SA1]; + * hex_new_value[HEX_REG_LC1] = hex_gpr[HEX_REG_LC1] - 1; + * } + */ + TCGLabel *label = gen_new_label(); + tcg_gen_brcondi_tl(TCG_COND_LEU, hex_gpr[HEX_REG_LC1], 1, label); + { + gen_jumpr(ctx, hex_gpr[HEX_REG_SA1]); + tcg_gen_subi_tl(hex_new_value[HEX_REG_LC1], hex_gpr[HEX_REG_LC1], 1); + } + gen_set_label(label); +} + +static void gen_endloop01(DisasContext *ctx) +{ + TCGv lpcfg = tcg_temp_local_new(); + + GET_USR_FIELD(USR_LPCFG, lpcfg); + + /* + * if (lpcfg == 1) { + * hex_new_pred_value[3] = 0xff; + * hex_pred_written |= 1 << 3; + * } + */ + TCGLabel *label1 = gen_new_label(); + tcg_gen_brcondi_tl(TCG_COND_NE, lpcfg, 1, label1); + { + tcg_gen_movi_tl(hex_new_pred_value[3], 0xff); + tcg_gen_ori_tl(hex_pred_written, hex_pred_written, 1 << 3); + } + gen_set_label(label1); + + /* + * if (lpcfg) { + * SET_USR_FIELD(USR_LPCFG, lpcfg - 1); + * } + */ + TCGLabel *label2 = gen_new_label(); + tcg_gen_brcondi_tl(TCG_COND_EQ, lpcfg, 0, label2); + { + tcg_gen_subi_tl(lpcfg, lpcfg, 1); + SET_USR_FIELD(USR_LPCFG, lpcfg); + } + gen_set_label(label2); + + /* + * if (hex_gpr[HEX_REG_LC0] > 1) { + * PC = hex_gpr[HEX_REG_SA0]; + * hex_new_value[HEX_REG_LC0] = hex_gpr[HEX_REG_LC0] - 1; + * } else { + * if (hex_gpr[HEX_REG_LC1] > 1) { + * hex_next_pc = hex_gpr[HEX_REG_SA1]; + * hex_new_value[HEX_REG_LC1] = hex_gpr[HEX_REG_LC1] - 1; + * } + * } + */ + TCGLabel *label3 = gen_new_label(); + TCGLabel *done = gen_new_label(); + tcg_gen_brcondi_tl(TCG_COND_LEU, hex_gpr[HEX_REG_LC0], 1, label3); + { + gen_jumpr(ctx, hex_gpr[HEX_REG_SA0]); + tcg_gen_subi_tl(hex_new_value[HEX_REG_LC0], hex_gpr[HEX_REG_LC0], 1); + tcg_gen_br(done); + } + gen_set_label(label3); + tcg_gen_brcondi_tl(TCG_COND_LEU, hex_gpr[HEX_REG_LC1], 1, done); + { + gen_jumpr(ctx, hex_gpr[HEX_REG_SA1]); + tcg_gen_subi_tl(hex_new_value[HEX_REG_LC1], hex_gpr[HEX_REG_LC1], 1); + } + gen_set_label(done); + tcg_temp_free(lpcfg); +} + static void gen_cmp_jumpnv(DisasContext *ctx, TCGCond cond, TCGv val, TCGv src, int pc_off) { From patchwork Thu Jan 5 22:13:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Simpson X-Patchwork-Id: 13090634 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 27CFAC3DA7A for ; Thu, 5 Jan 2023 22:36:16 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pDYUc-00071H-0x; Thu, 05 Jan 2023 17:13:46 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUX-0006x2-Vd for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:41 -0500 Received: from mx0a-0031df01.pphosted.com ([205.220.168.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUU-000288-Ao for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:41 -0500 Received: from pps.filterd (m0279865.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 305Luilx013584; Thu, 5 Jan 2023 22:13:36 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=qcppdkim1; bh=CDAPYuj4LjnC9b1fTx08cXtgdcwbYJhNcC2qESzGJLw=; b=THDxa19Z8kCBm/mUUkUUlkhcNM0lptCzHIq8tLJV0asH5IUTK2Tdm7cchqEtbWpkb+58 b8tqBBKUo8HzkHubL0iGzHyj3JYUmjJ9ANWbSGgEFbywtlEw3kOyQs9JvqbzSAD97bmH YWdbS4NxSqvKvyIJAbg7TbuN3K6K3m7H/Hr+TJ+OkgWDQVgkghL8lVVPWtkXNz+auU9m 8qHDrAycFDDmhFNzKHTa3IqTyD0OqOqpnRjeUIUrkExGjAj5FXqeeg7Rnol+xez7meT5 yjPfibu1i3bU8hV67CFXtTZL12m77b2AdQp8e+hDFFeIyTzatMe1DnlJ5EheCsAOUD4W GA== Received: from nalasppmta05.qualcomm.com (Global_NAT1.qualcomm.com [129.46.96.20]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3mwwfs97xm-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 05 Jan 2023 22:13:35 +0000 Received: from pps.filterd (NALASPPMTA05.qualcomm.com [127.0.0.1]) by NALASPPMTA05.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTP id 305MDZit031218; Thu, 5 Jan 2023 22:13:35 GMT Received: from pps.reinject (localhost [127.0.0.1]) by NALASPPMTA05.qualcomm.com (PPS) with ESMTP id 3mx2kp91ah-1; Thu, 05 Jan 2023 22:13:35 +0000 Received: from NALASPPMTA05.qualcomm.com (NALASPPMTA05.qualcomm.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 305MDYrS031201; Thu, 5 Jan 2023 22:13:34 GMT Received: from hu-devc-lv-u18-c.qualcomm.com (hu-tsimpson-lv.qualcomm.com [10.47.235.220]) by NALASPPMTA05.qualcomm.com (PPS) with ESMTP id 305MDYRE031199; Thu, 05 Jan 2023 22:13:34 +0000 Received: by hu-devc-lv-u18-c.qualcomm.com (Postfix, from userid 47164) id 3DE8C5000B0; Thu, 5 Jan 2023 14:13:34 -0800 (PST) From: Taylor Simpson To: qemu-devel@nongnu.org Cc: tsimpson@quicinc.com, richard.henderson@linaro.org, philmd@linaro.org, ale@rev.ng, anjo@rev.ng, bcain@quicinc.com, quic_mathbern@quicinc.com Subject: [PATCH v3 4/9] Hexagon (target/hexagon) Add overrides for dealloc-return instructions Date: Thu, 5 Jan 2023 14:13:26 -0800 Message-Id: <20230105221331.12069-5-tsimpson@quicinc.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230105221331.12069-1-tsimpson@quicinc.com> References: <20230105221331.12069-1-tsimpson@quicinc.com> MIME-Version: 1.0 X-QCInternal: smtphost X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-GUID: zOTnqKFY3Y36SY76960ZnZi6ekblM3uw X-Proofpoint-ORIG-GUID: zOTnqKFY3Y36SY76960ZnZi6ekblM3uw X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2023-01-05_12,2023-01-05_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=1 spamscore=1 suspectscore=0 phishscore=0 impostorscore=0 clxscore=1015 priorityscore=1501 mlxlogscore=202 mlxscore=1 adultscore=0 bulkscore=0 malwarescore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301050174 Received-SPF: pass client-ip=205.220.168.131; envelope-from=tsimpson@qualcomm.com; helo=mx0a-0031df01.pphosted.com X-Spam_score_int: -24 X-Spam_score: -2.5 X-Spam_bar: -- X-Spam_report: (-2.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.25, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org These instructions perform a deallocframe+return (jumpr r31) Add overrides for L4_return SL2_return L4_return_t L4_return_f L4_return_tnew_pt L4_return_fnew_pt L4_return_tnew_pnt L4_return_fnew_pnt SL2_return_t SL2_return_f SL2_return_tnew SL2_return_fnew This patch eliminates the last helper that uses write_new_pc, so we remove it from op_helper.c Signed-off-by: Taylor Simpson --- target/hexagon/gen_tcg.h | 54 ++++++++++++++++++++++++ target/hexagon/genptr.c | 86 ++++++++++++++++++++++++++++++++++++++ target/hexagon/op_helper.c | 26 +----------- 3 files changed, 141 insertions(+), 25 deletions(-) diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h index 6267f51ccc..8282ff3fc5 100644 --- a/target/hexagon/gen_tcg.h +++ b/target/hexagon/gen_tcg.h @@ -508,6 +508,60 @@ #define fGEN_TCG_S2_storerinew_pcr(SHORTCODE) \ fGEN_TCG_STORE_pcr(2, fSTORE(1, 4, EA, NtN)) +/* + * dealloc_return + * Assembler mapped to + * r31:30 = dealloc_return(r30):raw + */ +#define fGEN_TCG_L4_return(SHORTCODE) \ + gen_return(ctx, RddV, RsV) + +/* + * sub-instruction version (no RddV, so handle it manually) + */ +#define fGEN_TCG_SL2_return(SHORTCODE) \ + do { \ + TCGv_i64 RddV = tcg_temp_new_i64(); \ + gen_return(ctx, RddV, hex_gpr[HEX_REG_FP]); \ + gen_log_reg_write_pair(HEX_REG_FP, RddV); \ + tcg_temp_free_i64(RddV); \ + } while (0) + +/* + * Conditional returns follow this naming convention + * _t predicate true + * _f predicate false + * _tnew_pt predicate.new true predict taken + * _fnew_pt predicate.new false predict taken + * _tnew_pnt predicate.new true predict not taken + * _fnew_pnt predicate.new false predict not taken + * Predictions are not modelled in QEMU + * + * Example: + * if (p1) r31:30 = dealloc_return(r30):raw + */ +#define fGEN_TCG_L4_return_t(SHORTCODE) \ + gen_cond_return(ctx, RddV, RsV, PvV, TCG_COND_EQ); +#define fGEN_TCG_L4_return_f(SHORTCODE) \ + gen_cond_return(ctx, RddV, RsV, PvV, TCG_COND_NE) +#define fGEN_TCG_L4_return_tnew_pt(SHORTCODE) \ + gen_cond_return(ctx, RddV, RsV, PvN, TCG_COND_EQ) +#define fGEN_TCG_L4_return_fnew_pt(SHORTCODE) \ + gen_cond_return(ctx, RddV, RsV, PvN, TCG_COND_NE) +#define fGEN_TCG_L4_return_tnew_pnt(SHORTCODE) \ + gen_cond_return(ctx, RddV, RsV, PvN, TCG_COND_EQ) +#define fGEN_TCG_L4_return_fnew_pnt(SHORTCODE) \ + gen_cond_return(ctx, RddV, RsV, PvN, TCG_COND_NE) + +#define fGEN_TCG_SL2_return_t(SHORTCODE) \ + gen_cond_return_subinsn(ctx, TCG_COND_EQ, hex_pred[0]) +#define fGEN_TCG_SL2_return_f(SHORTCODE) \ + gen_cond_return_subinsn(ctx, TCG_COND_NE, hex_pred[0]) +#define fGEN_TCG_SL2_return_tnew(SHORTCODE) \ + gen_cond_return_subinsn(ctx, TCG_COND_EQ, hex_new_pred_value[0]) +#define fGEN_TCG_SL2_return_fnew(SHORTCODE) \ + gen_cond_return_subinsn(ctx, TCG_COND_NE, hex_new_pred_value[0]) + /* * Mathematical operations with more than one definition require * special handling diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c index 94054b10e6..7243bb00aa 100644 --- a/target/hexagon/genptr.c +++ b/target/hexagon/genptr.c @@ -706,6 +706,92 @@ static void gen_cond_callr(DisasContext *ctx, gen_set_label(skip); } +/* frame ^= (int64_t)FRAMEKEY << 32 */ +static void gen_frame_unscramble(TCGv_i64 frame) +{ + TCGv_i64 framekey = tcg_temp_new_i64(); + tcg_gen_extu_i32_i64(framekey, hex_gpr[HEX_REG_FRAMEKEY]); + tcg_gen_shli_i64(framekey, framekey, 32); + tcg_gen_xor_i64(frame, frame, framekey); + tcg_temp_free_i64(framekey); +} + +static void gen_load_frame(DisasContext *ctx, TCGv_i64 frame, TCGv EA) +{ + Insn *insn = ctx->insn; /* Needed for CHECK_NOSHUF */ + CHECK_NOSHUF(EA, 8); + tcg_gen_qemu_ld64(frame, EA, ctx->mem_idx); +} + +static void gen_return_base(DisasContext *ctx, TCGv_i64 dst, TCGv src, + TCGv r29) +{ + /* + * frame = *src + * dst = frame_unscramble(frame) + * SP = src + 8 + * PC = dst.w[1] + */ + TCGv_i64 frame = tcg_temp_new_i64(); + TCGv r31 = tcg_temp_new(); + + gen_load_frame(ctx, frame, src); + gen_frame_unscramble(frame); + tcg_gen_mov_i64(dst, frame); + tcg_gen_addi_tl(r29, src, 8); + tcg_gen_extrh_i64_i32(r31, dst); + gen_jumpr(ctx, r31); + + tcg_temp_free_i64(frame); + tcg_temp_free(r31); +} + +static void gen_return(DisasContext *ctx, TCGv_i64 dst, TCGv src) +{ + TCGv r29 = tcg_temp_new(); + gen_return_base(ctx, dst, src, r29); + gen_log_reg_write(HEX_REG_SP, r29); + tcg_temp_free(r29); +} + +/* if (pred) dst = dealloc_return(src):raw */ +static void gen_cond_return(DisasContext *ctx, TCGv_i64 dst, TCGv src, + TCGv pred, TCGCond cond) +{ + TCGv LSB = tcg_temp_new(); + TCGv mask = tcg_temp_new(); + TCGv r29 = tcg_temp_local_new(); + TCGLabel *skip = gen_new_label(); + tcg_gen_andi_tl(LSB, pred, 1); + + /* Initialize the results in case the predicate is false */ + tcg_gen_movi_i64(dst, 0); + tcg_gen_movi_tl(r29, 0); + + /* Set the bit in hex_slot_cancelled if the predicate is flase */ + tcg_gen_movi_tl(mask, 1 << ctx->insn->slot); + tcg_gen_or_tl(mask, hex_slot_cancelled, mask); + tcg_gen_movcond_tl(cond, hex_slot_cancelled, LSB, tcg_constant_tl(0), + mask, hex_slot_cancelled); + tcg_temp_free(mask); + + tcg_gen_brcondi_tl(cond, LSB, 0, skip); + tcg_temp_free(LSB); + gen_return_base(ctx, dst, src, r29); + gen_set_label(skip); + gen_log_predicated_reg_write(HEX_REG_SP, r29, ctx->insn->slot); + tcg_temp_free(r29); +} + +/* sub-instruction version (no RddV, so handle it manually) */ +static void gen_cond_return_subinsn(DisasContext *ctx, TCGCond cond, TCGv pred) +{ + TCGv_i64 RddV = tcg_temp_local_new_i64(); + gen_cond_return(ctx, RddV, hex_gpr[HEX_REG_FP], pred, cond); + gen_log_predicated_reg_write_pair(HEX_REG_FP, RddV, ctx->insn->slot); + tcg_temp_free_i64(RddV); +} + static void gen_endloop0(DisasContext *ctx) { TCGv lpcfg = tcg_temp_local_new(); diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c index 35449ef524..38b8aee193 100644 --- a/target/hexagon/op_helper.c +++ b/target/hexagon/op_helper.c @@ -1,5 +1,5 @@ /* - * Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved. + * Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -105,30 +105,6 @@ void log_store64(CPUHexagonState *env, target_ulong addr, env->mem_log_stores[slot].data64 = val; } -void write_new_pc(CPUHexagonState *env, bool pkt_has_multi_cof, - target_ulong addr) -{ - HEX_DEBUG_LOG("write_new_pc(0x" TARGET_FMT_lx ")\n", addr); - - if (pkt_has_multi_cof) { - /* - * If more than one branch is taken in a packet, only the first one - * is actually done. - */ - if (env->branch_taken) { - HEX_DEBUG_LOG("INFO: multiple branches taken in same packet, " - "ignoring the second one\n"); - } else { - fCHECK_PCALIGN(addr); - env->gpr[HEX_REG_PC] = addr; - env->branch_taken = 1; - } - } else { - fCHECK_PCALIGN(addr); - env->gpr[HEX_REG_PC] = addr; - } -} - /* Handy place to set a breakpoint */ void HELPER(debug_start_packet)(CPUHexagonState *env) { From patchwork Thu Jan 5 22:13:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Simpson X-Patchwork-Id: 13090624 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6196EC54EBC for ; Thu, 5 Jan 2023 22:28:38 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pDYUd-00072t-DY; Thu, 05 Jan 2023 17:13:47 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUa-0006zv-Nv for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:44 -0500 Received: from mx0a-0031df01.pphosted.com ([205.220.168.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUU-00028G-Jg for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:44 -0500 Received: from pps.filterd (m0279863.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 305EuIPm009390; Thu, 5 Jan 2023 22:13:36 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=qcppdkim1; bh=/X95UmrrnyrNxZL7P8tu+335gayQxA3I0PM8ZfMkf7A=; b=eBGI+ILq1FS3pWkKoTL656ZCO26GsRJ4aK//vOWUnU5A1DCq+bDQTo/8RMsEa1Q4uncn YUX4QYlclC6xIcFORGZbI3wJy8uTh5ATSaCCYJu7kpdMUI+w6yGBdUojRK1co52HfdYH mZzoQ4ZCGTJrWILQf4kjeoa7dk0jqWJm0iAgIUJYRzlMuIWdrymfES0ABQXo50b2LIHn oDMdK00d+d8GZf14FHTtsEQMtHpc34jGag/QDCT9Ab4fht3x9l7GQVNoFGpqenRZPn9n Hhnvl1L3lmUHzxdtEq2KN/MzodnQaU48ANMer2rpafMLgIaMVjN2x3gzbwW5BE06yZiE 1w== Received: from nalasppmta04.qualcomm.com (Global_NAT1.qualcomm.com [129.46.96.20]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3mwpwnt16n-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 05 Jan 2023 22:13:36 +0000 Received: from pps.filterd (NALASPPMTA04.qualcomm.com [127.0.0.1]) by NALASPPMTA04.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTP id 305MDZRg027155; Thu, 5 Jan 2023 22:13:35 GMT Received: from pps.reinject (localhost [127.0.0.1]) by NALASPPMTA04.qualcomm.com (PPS) with ESMTP id 3mx6kmg398-1; Thu, 05 Jan 2023 22:13:35 +0000 Received: from NALASPPMTA04.qualcomm.com (NALASPPMTA04.qualcomm.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 305MDZcr027149; Thu, 5 Jan 2023 22:13:35 GMT Received: from hu-devc-lv-u18-c.qualcomm.com (hu-tsimpson-lv.qualcomm.com [10.47.235.220]) by NALASPPMTA04.qualcomm.com (PPS) with ESMTP id 305MDYkS027145; Thu, 05 Jan 2023 22:13:35 +0000 Received: by hu-devc-lv-u18-c.qualcomm.com (Postfix, from userid 47164) id 41A1D5000B1; Thu, 5 Jan 2023 14:13:34 -0800 (PST) From: Taylor Simpson To: qemu-devel@nongnu.org Cc: tsimpson@quicinc.com, richard.henderson@linaro.org, philmd@linaro.org, ale@rev.ng, anjo@rev.ng, bcain@quicinc.com, quic_mathbern@quicinc.com Subject: [PATCH v3 5/9] Hexagon (target/hexagon) Analyze packet before generating TCG Date: Thu, 5 Jan 2023 14:13:27 -0800 Message-Id: <20230105221331.12069-6-tsimpson@quicinc.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230105221331.12069-1-tsimpson@quicinc.com> References: <20230105221331.12069-1-tsimpson@quicinc.com> MIME-Version: 1.0 X-QCInternal: smtphost X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-ORIG-GUID: O0MHOTTyKGdjYX2Qz9Hi0HufW6RzLaII X-Proofpoint-GUID: O0MHOTTyKGdjYX2Qz9Hi0HufW6RzLaII X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2023-01-05_12,2023-01-05_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxscore=0 mlxlogscore=851 suspectscore=0 adultscore=0 lowpriorityscore=0 phishscore=0 spamscore=0 malwarescore=0 priorityscore=1501 bulkscore=0 impostorscore=0 clxscore=1015 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301050174 Received-SPF: pass client-ip=205.220.168.131; envelope-from=tsimpson@qualcomm.com; helo=mx0a-0031df01.pphosted.com X-Spam_score_int: -24 X-Spam_score: -2.5 X-Spam_bar: -- X-Spam_report: (-2.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.25, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org We create a new generator that creates an analyze_ function for each instruction. Currently, these functions record the writes to R, P, and C registers by calling ctx_log_reg_write[_pair] or ctx_log_pred_write. During gen_start_packet, we invoke the analyze_ function for each instruction in the packet, and we mark the implicit register and predicate writes. Doing the analysis up front has several advantages - We remove calls to ctx_log_* from gen_tcg_funcs.py and genptr.c - After the analysis is performed, we can initialize hex_new_value for each of the predicated assignments rather than during TCG generation for the instructions - This is a stepping stone for future work where the analysis will include the set of registers that are written. In cases where the packet doesn't have an overlap between the registers that are written and registers that are read, we can avoid the intermediate step of writing to hex_new_value. Note that other checks will also be needed (e.g., no instructions can raise an exception). Signed-off-by: Taylor Simpson --- target/hexagon/translate.h | 46 ++-- target/hexagon/genptr.c | 5 +- target/hexagon/idef-parser/parser-helpers.c | 7 +- target/hexagon/translate.c | 157 +++++++------ target/hexagon/README | 11 +- target/hexagon/gen_analyze_func_table.py | 52 +++++ target/hexagon/gen_analyze_funcs.py | 239 ++++++++++++++++++++ target/hexagon/gen_tcg_funcs.py | 23 +- target/hexagon/meson.build | 20 +- 9 files changed, 442 insertions(+), 118 deletions(-) create mode 100755 target/hexagon/gen_analyze_func_table.py create mode 100755 target/hexagon/gen_analyze_funcs.py diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h index d971f4f095..7e864b417d 100644 --- a/target/hexagon/translate.h +++ b/target/hexagon/translate.h @@ -1,5 +1,5 @@ /* - * Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved. + * Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -38,6 +38,7 @@ typedef struct DisasContext { int reg_log[REG_WRITES_MAX]; int reg_log_idx; DECLARE_BITMAP(regs_written, TOTAL_PER_THREAD_REGS); + DECLARE_BITMAP(predicated_regs, TOTAL_PER_THREAD_REGS); int preg_log[PRED_WRITES_MAX]; int preg_log_idx; DECLARE_BITMAP(pregs_written, NUM_PREGS); @@ -62,32 +63,39 @@ typedef struct DisasContext { bool is_tight_loop; } DisasContext; -static inline void ctx_log_reg_write(DisasContext *ctx, int rnum) +static inline void ctx_log_pred_write(DisasContext *ctx, int pnum) { - if (test_bit(rnum, ctx->regs_written)) { - HEX_DEBUG_LOG("WARNING: Multiple writes to r%d\n", rnum); + if (!test_bit(pnum, ctx->pregs_written)) { + ctx->preg_log[ctx->preg_log_idx] = pnum; + ctx->preg_log_idx++; + set_bit(pnum, ctx->pregs_written); } - ctx->reg_log[ctx->reg_log_idx] = rnum; - ctx->reg_log_idx++; - set_bit(rnum, ctx->regs_written); -} - -static inline void ctx_log_reg_write_pair(DisasContext *ctx, int rnum) -{ - ctx_log_reg_write(ctx, rnum); - ctx_log_reg_write(ctx, rnum + 1); } -static inline void ctx_log_pred_write(DisasContext *ctx, int pnum) +static inline void ctx_log_reg_write(DisasContext *ctx, int rnum, + bool is_predicated) { - ctx->preg_log[ctx->preg_log_idx] = pnum; - ctx->preg_log_idx++; - set_bit(pnum, ctx->pregs_written); + if (rnum == HEX_REG_P3_0) { + for (int i = 0; i < NUM_PREGS; i++) { + ctx_log_pred_write(ctx, i); + } + } else { + if (!test_bit(rnum, ctx->regs_written)) { + ctx->reg_log[ctx->reg_log_idx] = rnum; + ctx->reg_log_idx++; + set_bit(rnum, ctx->regs_written); + } + if (is_predicated) { + set_bit(rnum, ctx->predicated_regs); + } + } } -static inline bool is_preloaded(DisasContext *ctx, int num) +static inline void ctx_log_reg_write_pair(DisasContext *ctx, int rnum, + bool is_predicated) { - return test_bit(num, ctx->regs_written); + ctx_log_reg_write(ctx, rnum, is_predicated); + ctx_log_reg_write(ctx, rnum + 1, is_predicated); } static inline bool is_vreg_preloaded(DisasContext *ctx, int num) diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c index 7243bb00aa..5e5ae0cdca 100644 --- a/target/hexagon/genptr.c +++ b/target/hexagon/genptr.c @@ -149,6 +149,7 @@ void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val) hex_new_pred_value[pnum], base_val); } tcg_gen_ori_tl(hex_pred_written, hex_pred_written, 1 << pnum); + set_bit(pnum, ctx->pregs_written); tcg_temp_free(base_val); } @@ -231,7 +232,6 @@ static void gen_write_p3_0(DisasContext *ctx, TCGv control_reg) for (int i = 0; i < NUM_PREGS; i++) { tcg_gen_extract_tl(hex_p8, control_reg, i * 8, 8); gen_log_pred_write(ctx, i, hex_p8); - ctx_log_pred_write(ctx, i); } tcg_temp_free(hex_p8); } @@ -250,7 +250,6 @@ static inline void gen_write_ctrl_reg(DisasContext *ctx, int reg_num, gen_write_p3_0(ctx, val); } else { gen_log_reg_write(reg_num, val); - ctx_log_reg_write(ctx, reg_num); if (reg_num == HEX_REG_QEMU_PKT_CNT) { ctx->num_packets = 0; } @@ -273,10 +272,8 @@ static inline void gen_write_ctrl_reg_pair(DisasContext *ctx, int reg_num, tcg_gen_extrh_i64_i32(val32, val); gen_log_reg_write(reg_num + 1, val32); tcg_temp_free(val32); - ctx_log_reg_write(ctx, reg_num + 1); } else { gen_log_reg_write_pair(reg_num, val); - ctx_log_reg_write_pair(ctx, reg_num); if (reg_num == HEX_REG_QEMU_PKT_CNT) { ctx->num_packets = 0; ctx->num_insns = 0; diff --git a/target/hexagon/idef-parser/parser-helpers.c b/target/hexagon/idef-parser/parser-helpers.c index 8110686c51..de9838806b 100644 --- a/target/hexagon/idef-parser/parser-helpers.c +++ b/target/hexagon/idef-parser/parser-helpers.c @@ -1,5 +1,5 @@ /* - * Copyright(c) 2019-2022 rev.ng Labs Srl. All Rights Reserved. + * Copyright(c) 2019-2023 rev.ng Labs Srl. All Rights Reserved. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -1438,10 +1438,6 @@ void gen_write_reg(Context *c, YYLTYPE *locp, HexValue *reg, HexValue *value) locp, "gen_log_reg_write(", ®->reg.id, ", ", &value_m, ");\n"); - OUT(c, - locp, - "ctx_log_reg_write(ctx, ", ®->reg.id, - ");\n"); gen_rvalue_free(c, locp, reg); gen_rvalue_free(c, locp, &value_m); } @@ -1894,7 +1890,6 @@ void gen_pred_assign(Context *c, YYLTYPE *locp, HexValue *left_pred, if (is_direct) { OUT(c, locp, "gen_log_pred_write(ctx, ", pred_id, ", ", left_pred, ");\n"); - OUT(c, locp, "ctx_log_pred_write(ctx, ", pred_id, ");\n"); gen_rvalue_free(c, locp, left_pred); } /* Free temporary value */ diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c index 75f28e08ad..3e6e72d046 100644 --- a/target/hexagon/translate.c +++ b/target/hexagon/translate.c @@ -1,5 +1,5 @@ /* - * Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved. + * Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -29,6 +29,10 @@ #include "translate.h" #include "printinsn.h" +typedef void (*AnalyzeInsn)(struct DisasContext *ctx); +#include "analyze_funcs_generated.c.inc" +#include "analyze_func_table_generated.c.inc" + TCGv hex_gpr[TOTAL_PER_THREAD_REGS]; TCGv hex_pred[NUM_PREGS]; TCGv hex_this_PC; @@ -265,6 +269,76 @@ static bool need_next_PC(DisasContext *ctx) return false; } +/* + * The opcode_analyze functions mark most of the writes in a packet + * However, there are some implicit writes marked as attributes + * of the applicable instructions. + */ +static void mark_implicit_reg_write(DisasContext *ctx, int attrib, int rnum) +{ + uint16_t opcode = ctx->insn->opcode; + if (GET_ATTRIB(opcode, attrib)) { + /* + * USR is used to set overflow and FP exceptions, + * so treat it as conditional + */ + bool is_predicated = GET_ATTRIB(opcode, A_CONDEXEC) || + rnum == HEX_REG_USR; + + /* LC0/LC1 is conditionally written by endloop instructions */ + if ((rnum == HEX_REG_LC0 || rnum == HEX_REG_LC1) && + (opcode == J2_endloop0 || + opcode == J2_endloop1 || + opcode == J2_endloop01)) { + is_predicated = true; + } + + ctx_log_reg_write(ctx, rnum, is_predicated); + } +} + +static void mark_implicit_reg_writes(DisasContext *ctx) +{ + mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_FP, HEX_REG_FP); + mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_SP, HEX_REG_SP); + mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_LR, HEX_REG_LR); + mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_LC0, HEX_REG_LC0); + mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_SA0, HEX_REG_SA0); + mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_LC1, HEX_REG_LC1); + mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_SA1, HEX_REG_SA1); + mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_USR, HEX_REG_USR); + mark_implicit_reg_write(ctx, A_FPOP, HEX_REG_USR); +} + +static void mark_implicit_pred_write(DisasContext *ctx, int attrib, int pnum) +{ + if (GET_ATTRIB(ctx->insn->opcode, attrib)) { + ctx_log_pred_write(ctx, pnum); + } +} + +static void mark_implicit_pred_writes(DisasContext *ctx) +{ + mark_implicit_pred_write(ctx, A_IMPLICIT_WRITES_P0, 0); + mark_implicit_pred_write(ctx, A_IMPLICIT_WRITES_P1, 1); + mark_implicit_pred_write(ctx, A_IMPLICIT_WRITES_P2, 2); + mark_implicit_pred_write(ctx, A_IMPLICIT_WRITES_P3, 3); +} + +static void analyze_packet(DisasContext *ctx) +{ + Packet *pkt = ctx->pkt; + for (int i = 0; i < pkt->num_insns; i++) { + Insn *insn = &pkt->insn[i]; + ctx->insn = insn; + if (opcode_analyze[insn->opcode]) { + opcode_analyze[insn->opcode](ctx); + } + mark_implicit_reg_writes(ctx); + mark_implicit_pred_writes(ctx); + } +} + static void gen_start_packet(DisasContext *ctx) { Packet *pkt = ctx->pkt; @@ -275,6 +349,7 @@ static void gen_start_packet(DisasContext *ctx) ctx->next_PC = next_PC; ctx->reg_log_idx = 0; bitmap_zero(ctx->regs_written, TOTAL_PER_THREAD_REGS); + bitmap_zero(ctx->predicated_regs, TOTAL_PER_THREAD_REGS); ctx->preg_log_idx = 0; bitmap_zero(ctx->pregs_written, NUM_PREGS); ctx->future_vregs_idx = 0; @@ -291,6 +366,14 @@ static void gen_start_packet(DisasContext *ctx) ctx->s1_store_processed = false; ctx->pre_commit = true; + analyze_packet(ctx); + + /* + * pregs_written is used both in the analyze phase as well as the code + * gen phase, so clear it again. + */ + bitmap_zero(ctx->pregs_written, NUM_PREGS); + if (HEX_DEBUG) { /* Handy place to set a breakpoint before the packet executes */ gen_helper_debug_start_packet(cpu_env); @@ -313,6 +396,16 @@ static void gen_start_packet(DisasContext *ctx) tcg_gen_movi_tl(hex_pred_written, 0); } + /* Preload the predicated registers into hex_new_value[i] */ + if (!bitmap_empty(ctx->predicated_regs, TOTAL_PER_THREAD_REGS)) { + int i = find_first_bit(ctx->predicated_regs, TOTAL_PER_THREAD_REGS); + while (i < TOTAL_PER_THREAD_REGS) { + tcg_gen_mov_tl(hex_new_value[i], hex_gpr[i]); + i = find_next_bit(ctx->predicated_regs, TOTAL_PER_THREAD_REGS, + i + 1); + } + } + if (pkt->pkt_has_hvx) { tcg_gen_movi_tl(hex_VRegs_updated, 0); tcg_gen_movi_tl(hex_QRegs_updated, 0); @@ -336,66 +429,6 @@ bool is_gather_store_insn(DisasContext *ctx) return false; } -/* - * The LOG_*_WRITE macros mark most of the writes in a packet - * However, there are some implicit writes marked as attributes - * of the applicable instructions. - */ -static void mark_implicit_reg_write(DisasContext *ctx, int attrib, int rnum) -{ - uint16_t opcode = ctx->insn->opcode; - if (GET_ATTRIB(opcode, attrib)) { - /* - * USR is used to set overflow and FP exceptions, - * so treat it as conditional - */ - bool is_predicated = GET_ATTRIB(opcode, A_CONDEXEC) || - rnum == HEX_REG_USR; - - /* LC0/LC1 is conditionally written by endloop instructions */ - if ((rnum == HEX_REG_LC0 || rnum == HEX_REG_LC1) && - (opcode == J2_endloop0 || - opcode == J2_endloop1 || - opcode == J2_endloop01)) { - is_predicated = true; - } - - if (is_predicated && !is_preloaded(ctx, rnum)) { - tcg_gen_mov_tl(hex_new_value[rnum], hex_gpr[rnum]); - } - - ctx_log_reg_write(ctx, rnum); - } -} - -static void mark_implicit_pred_write(DisasContext *ctx, int attrib, int pnum) -{ - if (GET_ATTRIB(ctx->insn->opcode, attrib)) { - ctx_log_pred_write(ctx, pnum); - } -} - -static void mark_implicit_reg_writes(DisasContext *ctx) -{ - mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_FP, HEX_REG_FP); - mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_SP, HEX_REG_SP); - mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_LR, HEX_REG_LR); - mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_LC0, HEX_REG_LC0); - mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_SA0, HEX_REG_SA0); - mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_LC1, HEX_REG_LC1); - mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_SA1, HEX_REG_SA1); - mark_implicit_reg_write(ctx, A_IMPLICIT_WRITES_USR, HEX_REG_USR); - mark_implicit_reg_write(ctx, A_FPOP, HEX_REG_USR); -} - -static void mark_implicit_pred_writes(DisasContext *ctx) -{ - mark_implicit_pred_write(ctx, A_IMPLICIT_WRITES_P0, 0); - mark_implicit_pred_write(ctx, A_IMPLICIT_WRITES_P1, 1); - mark_implicit_pred_write(ctx, A_IMPLICIT_WRITES_P2, 2); - mark_implicit_pred_write(ctx, A_IMPLICIT_WRITES_P3, 3); -} - static void mark_store_width(DisasContext *ctx) { uint16_t opcode = ctx->insn->opcode; @@ -423,9 +456,7 @@ static void mark_store_width(DisasContext *ctx) static void gen_insn(DisasContext *ctx) { if (ctx->insn->generate) { - mark_implicit_reg_writes(ctx); ctx->insn->generate(ctx); - mark_implicit_pred_writes(ctx); mark_store_width(ctx); } else { gen_exception_end_tb(ctx, HEX_EXCP_INVALID_OPCODE); diff --git a/target/hexagon/README b/target/hexagon/README index 6cb5affddb..d00c85d554 100644 --- a/target/hexagon/README +++ b/target/hexagon/README @@ -52,6 +52,8 @@ header files in /target/hexagon gen_tcg_func_table.py -> tcg_func_table_generated.c.inc gen_helper_funcs.py -> helper_funcs_generated.c.inc gen_idef_parser_funcs.py -> idef_parser_input.h + gen_analyze_funcs.py -> analyze_funcs_generated.c.inc + gen_analyze_func_table.py -> analyze_func_table.c.inc Qemu helper functions have 3 parts DEF_HELPER declaration indicates the signature of the helper @@ -87,7 +89,6 @@ tcg_funcs_generated.c.inc TCGv RtV = hex_gpr[insn->regno[2]]; gen_helper_A2_add(RdV, cpu_env, RsV, RtV); gen_log_reg_write(RdN, RdV); - ctx_log_reg_write(ctx, RdN); tcg_temp_free(RdV); } @@ -162,7 +163,6 @@ istruction. gen_helper_V6_vaddw(cpu_env, VdV, VuV, VvV, slot); tcg_temp_free(slot); gen_log_vreg_write(ctx, VdV_off, VdN, EXT_DFL, insn->slot, false); - ctx_log_vreg_write(ctx, VdN, EXT_DFL, false); tcg_temp_free_ptr(VdV); tcg_temp_free_ptr(VuV); tcg_temp_free_ptr(VvV); @@ -195,9 +195,14 @@ when the override is present. vreg_src_off(ctx, VvN); fGEN_TCG_V6_vaddw({ fHIDE(int i;) fVFOREACH(32, i) { VdV.w[i] = VuV.w[i] + VvV.w[i] ; } }); gen_log_vreg_write(ctx, VdV_off, VdN, EXT_DFL, insn->slot, false); - ctx_log_vreg_write(ctx, VdN, EXT_DFL, false); } +We also generate an analyze_ function for each instruction. Currently, +these functions record the writes to registers by calling ctx_log_*. During +gen_start_packet, we invoke the analyze_ function for each instruction in +the packet, and we mark the implicit writes. After the analysis is performed, +we initialize hex_new_value for each of the predicated assignments. + In addition to instruction semantics, we use a generator to create the decode tree. This generation is also a two step process. The first step is to run target/hexagon/gen_dectree_import.c to produce diff --git a/target/hexagon/gen_analyze_func_table.py b/target/hexagon/gen_analyze_func_table.py new file mode 100755 index 0000000000..eee394b439 --- /dev/null +++ b/target/hexagon/gen_analyze_func_table.py @@ -0,0 +1,52 @@ +#!/usr/bin/env python3 + +## +## Copyright(c) 2022-2023 Qualcomm Innovation Center, Inc. All Rights Reserved. +## +## This program is free software; you can redistribute it and/or modify +## it under the terms of the GNU General Public License as published by +## the Free Software Foundation; either version 2 of the License, or +## (at your option) any later version. +## +## This program is distributed in the hope that it will be useful, +## but WITHOUT ANY WARRANTY; without even the implied warranty of +## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +## GNU General Public License for more details. +## +## You should have received a copy of the GNU General Public License +## along with this program; if not, see . +## + +import sys +import re +import string +import hex_common + +def main(): + hex_common.read_semantics_file(sys.argv[1]) + hex_common.read_attribs_file(sys.argv[2]) + hex_common.calculate_attribs() + tagregs = hex_common.get_tagregs() + tagimms = hex_common.get_tagimms() + + with open(sys.argv[3], 'w') as f: + f.write("#ifndef HEXAGON_ANALYZE_TABLE_H\n") + f.write("#define HEXAGON_ANALYZE_TABLE_H\n\n") + + f.write("static const AnalyzeInsn opcode_analyze[XX_LAST_OPCODE] = {\n") + for tag in hex_common.tags: + ## Skip the diag instructions + if ( tag == "Y6_diag" ) : + continue + if ( tag == "Y6_diag0" ) : + continue + if ( tag == "Y6_diag1" ) : + continue + + f.write(" [%s] = analyze_%s,\n" % (tag, tag)) + f.write("};\n\n") + + f.write("#endif /* HEXAGON_ANALYZE_TABLE_H */\n") + +if __name__ == "__main__": + main() diff --git a/target/hexagon/gen_analyze_funcs.py b/target/hexagon/gen_analyze_funcs.py new file mode 100755 index 0000000000..4358cc6ca3 --- /dev/null +++ b/target/hexagon/gen_analyze_funcs.py @@ -0,0 +1,239 @@ +#!/usr/bin/env python3 + +## +## Copyright(c) 2022-2023 Qualcomm Innovation Center, Inc. All Rights Reserved. +## +## This program is free software; you can redistribute it and/or modify +## it under the terms of the GNU General Public License as published by +## the Free Software Foundation; either version 2 of the License, or +## (at your option) any later version. +## +## This program is distributed in the hope that it will be useful, +## but WITHOUT ANY WARRANTY; without even the implied warranty of +## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +## GNU General Public License for more details. +## +## You should have received a copy of the GNU General Public License +## along with this program; if not, see . +## + +import sys +import re +import string +import hex_common + +## +## Helpers for gen_analyze_func +## +def is_predicated(tag): + return 'A_CONDEXEC' in hex_common.attribdict[tag] + +def analyze_opn_old(f, tag, regtype, regid, regno): + regN = "%s%sN" % (regtype, regid) + predicated = "true" if is_predicated(tag) else "false" + if (regtype == "R"): + if (regid in {"ss", "tt"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + elif (regid in {"dd", "ee", "xx", "yy"}): + f.write(" const int %s = insn->regno[%d];\n" % (regN, regno)) + f.write(" ctx_log_reg_write_pair(ctx, %s, %s);\n" % \ + (regN, predicated)) + elif (regid in {"s", "t", "u", "v"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + elif (regid in {"d", "e", "x", "y"}): + f.write(" const int %s = insn->regno[%d];\n" % (regN, regno)) + f.write(" ctx_log_reg_write(ctx, %s, %s);\n" % \ + (regN, predicated)) + else: + print("Bad register parse: ", regtype, regid) + elif (regtype == "P"): + if (regid in {"s", "t", "u", "v"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + elif (regid in {"d", "e", "x"}): + f.write(" const int %s = insn->regno[%d];\n" % (regN, regno)) + f.write(" ctx_log_pred_write(ctx, %s);\n" % (regN)) + else: + print("Bad register parse: ", regtype, regid) + elif (regtype == "C"): + if (regid == "ss"): + f.write("// const int %s = insn->regno[%d] + HEX_REG_SA0;\n" % \ + (regN, regno)) + elif (regid == "dd"): + f.write(" const int %s = insn->regno[%d] + HEX_REG_SA0;\n" % \ + (regN, regno)) + f.write(" ctx_log_reg_write_pair(ctx, %s, %s);\n" % \ + (regN, predicated)) + elif (regid == "s"): + f.write("// const int %s = insn->regno[%d] + HEX_REG_SA0;\n" % \ + (regN, regno)) + elif (regid == "d"): + f.write(" const int %s = insn->regno[%d] + HEX_REG_SA0;\n" % \ + (regN, regno)) + f.write(" ctx_log_reg_write(ctx, %s, %s);\n" % \ + (regN, predicated)) + else: + print("Bad register parse: ", regtype, regid) + elif (regtype == "M"): + if (regid == "u"): + f.write("// const int %s = insn->regno[%d];\n"% \ + (regN, regno)) + else: + print("Bad register parse: ", regtype, regid) + elif (regtype == "V"): + if (regid in {"dd", "xx"}): + f.write("// const int %s = insn->regno[%d];\n" %\ + (regN, regno)) + elif (regid in {"uu", "vv"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + elif (regid in {"s", "u", "v", "w"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + elif (regid in {"d", "x", "y"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + else: + print("Bad register parse: ", regtype, regid) + elif (regtype == "Q"): + if (regid in {"d", "e", "x"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + elif (regid in {"s", "t", "u", "v"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + else: + print("Bad register parse: ", regtype, regid) + elif (regtype == "G"): + if (regid in {"dd"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + elif (regid in {"d"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + elif (regid in {"ss"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + elif (regid in {"s"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + else: + print("Bad register parse: ", regtype, regid) + elif (regtype == "S"): + if (regid in {"dd"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + elif (regid in {"d"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + elif (regid in {"ss"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + elif (regid in {"s"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + else: + print("Bad register parse: ", regtype, regid) + else: + print("Bad register parse: ", regtype, regid) + +def analyze_opn_new(f, tag, regtype, regid, regno): + regN = "%s%sN" % (regtype, regid) + if (regtype == "N"): + if (regid in {"s", "t"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + else: + print("Bad register parse: ", regtype, regid) + elif (regtype == "P"): + if (regid in {"t", "u", "v"}): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + else: + print("Bad register parse: ", regtype, regid) + elif (regtype == "O"): + if (regid == "s"): + f.write("// const int %s = insn->regno[%d];\n" % \ + (regN, regno)) + else: + print("Bad register parse: ", regtype, regid) + else: + print("Bad register parse: ", regtype, regid) + +def analyze_opn(f, tag, regtype, regid, toss, numregs, i): + if (hex_common.is_pair(regid)): + analyze_opn_old(f, tag, regtype, regid, i) + elif (hex_common.is_single(regid)): + if hex_common.is_old_val(regtype, regid, tag): + analyze_opn_old(f,tag, regtype, regid, i) + elif hex_common.is_new_val(regtype, regid, tag): + analyze_opn_new(f, tag, regtype, regid, i) + else: + print("Bad register parse: ", regtype, regid, toss, numregs) + else: + print("Bad register parse: ", regtype, regid, toss, numregs) + +## +## Generate the code to analyze the instruction +## For A2_add: Rd32=add(Rs32,Rt32), { RdV=RsV+RtV;} +## We produce: +## static void analyze_A2_add(DisasContext *ctx) +## { +## Insn *insn __attribute__((unused)) = ctx->insn; +## const int RdN = insn->regno[0]; +## ctx_log_reg_write(ctx, RdN, false); +## // const int RsN = insn->regno[1]; +## // const int RtN = insn->regno[2]; +## } +## +def gen_analyze_func(f, tag, regs, imms): + f.write("static void analyze_%s(DisasContext *ctx)\n" %tag) + f.write('{\n') + + f.write(" Insn *insn __attribute__((unused)) = ctx->insn;\n") + + i=0 + ## Analyze all the registers + for regtype, regid, toss, numregs in regs: + analyze_opn(f, tag, regtype, regid, toss, numregs, i) + i += 1 + + + f.write("}\n\n") + +def gen_def_analyze_func(f, tag, tagregs, tagimms): + regs = tagregs[tag] + imms = tagimms[tag] + + gen_analyze_func(f, tag, regs, imms) + +def main(): + hex_common.read_semantics_file(sys.argv[1]) + hex_common.read_attribs_file(sys.argv[2]) + hex_common.read_overrides_file(sys.argv[3]) + hex_common.read_overrides_file(sys.argv[4]) + hex_common.calculate_attribs() + tagregs = hex_common.get_tagregs() + tagimms = hex_common.get_tagimms() + + with open(sys.argv[5], 'w') as f: + f.write("#ifndef HEXAGON_TCG_FUNCS_H\n") + f.write("#define HEXAGON_TCG_FUNCS_H\n\n") + + for tag in hex_common.tags: + ## Skip the diag instructions + if ( tag == "Y6_diag" ) : + continue + if ( tag == "Y6_diag0" ) : + continue + if ( tag == "Y6_diag1" ) : + continue + + gen_def_analyze_func(f, tag, tagregs, tagimms) + + f.write("#endif /* HEXAGON_TCG_FUNCS_H */\n") + +if __name__ == "__main__": + main() diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py index 7e8ba17ca2..5d686ebcf1 100755 --- a/target/hexagon/gen_tcg_funcs.py +++ b/target/hexagon/gen_tcg_funcs.py @@ -1,7 +1,7 @@ #!/usr/bin/env python3 ## -## Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved. +## Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved. ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by @@ -44,15 +44,6 @@ def genptr_decl_pair_writable(f, tag, regtype, regid, regno): (regN, regno)) else: f.write(" const int %s = insn->regno[%d];\n" % (regN, regno)) - if ('A_CONDEXEC' in hex_common.attribdict[tag]): - f.write(" if (!is_preloaded(ctx, %s)) {\n" % regN) - f.write(" tcg_gen_mov_tl(hex_new_value[%s], hex_gpr[%s]);\n" % \ - (regN, regN)) - f.write(" }\n") - f.write(" if (!is_preloaded(ctx, %s + 1)) {\n" % regN) - f.write(" tcg_gen_mov_tl(hex_new_value[%s + 1], hex_gpr[%s + 1]);\n" % \ - (regN, regN)) - f.write(" }\n") def genptr_decl_writable(f, tag, regtype, regid, regno): regN="%s%sN" % (regtype,regid) @@ -63,11 +54,6 @@ def genptr_decl_writable(f, tag, regtype, regid, regno): (regN, regno)) else: f.write(" const int %s = insn->regno[%d];\n" % (regN, regno)) - if ('A_CONDEXEC' in hex_common.attribdict[tag]): - f.write(" if (!is_preloaded(ctx, %s)) {\n" % regN) - f.write(" tcg_gen_mov_tl(hex_new_value[%s], hex_gpr[%s]);\n" % \ - (regN, regN)) - f.write(" }\n") def genptr_decl(f, tag, regtype, regid, regno): regN="%s%sN" % (regtype,regid) @@ -465,8 +451,6 @@ def genptr_dst_write_pair(f, tag, regtype, regid): else: f.write(" gen_log_reg_write_pair(%s%sN, %s%sV);\n" % \ (regtype, regid, regtype, regid)) - f.write(" ctx_log_reg_write_pair(ctx, %s%sN);\n" % \ - (regtype, regid)) def genptr_dst_write(f, tag, regtype, regid): if (regtype == "R"): @@ -480,16 +464,12 @@ def genptr_dst_write(f, tag, regtype, regid): else: f.write(" gen_log_reg_write(%s%sN, %s%sV);\n" % \ (regtype, regid, regtype, regid)) - f.write(" ctx_log_reg_write(ctx, %s%sN);\n" % \ - (regtype, regid)) else: print("Bad register parse: ", regtype, regid) elif (regtype == "P"): if (regid in {"d", "e", "x"}): f.write(" gen_log_pred_write(ctx, %s%sN, %s%sV);\n" % \ (regtype, regid, regtype, regid)) - f.write(" ctx_log_pred_write(ctx, %s%sN);\n" % \ - (regtype, regid)) else: print("Bad register parse: ", regtype, regid) elif (regtype == "C"): @@ -581,7 +561,6 @@ def genptr_dst_write_opn(f,regtype, regid, tag): ## TCGv RtV = hex_gpr[insn->regno[2]]; ## ## gen_log_reg_write(RdN, RdV); -## ctx_log_reg_write(ctx, RdN); ## tcg_temp_free(RdV); ## } ## diff --git a/target/hexagon/meson.build b/target/hexagon/meson.build index e8f250fcac..ba7374f87f 100644 --- a/target/hexagon/meson.build +++ b/target/hexagon/meson.build @@ -1,5 +1,5 @@ ## -## Copyright(c) 2020-2021 Qualcomm Innovation Center, Inc. All Rights Reserved. +## Copyright(c) 2020-2023 Qualcomm Innovation Center, Inc. All Rights Reserved. ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by @@ -67,6 +67,24 @@ tcg_func_table_generated = custom_target( ) hexagon_ss.add(tcg_func_table_generated) +analyze_funcs_generated = custom_target( + 'analyze_funcs_generated.c.inc', + output: 'analyze_funcs_generated.c.inc', + depends: [semantics_generated], + depend_files: [hex_common_py, attribs_def, gen_tcg_h, gen_tcg_hvx_h], + command: [python, files('gen_analyze_funcs.py'), semantics_generated, attribs_def, gen_tcg_h, gen_tcg_hvx_h, '@OUTPUT@'], +) +hexagon_ss.add(analyze_funcs_generated) + +analyze_func_table_generated = custom_target( + 'analyze_func_table_generated.c.inc', + output: 'analyze_func_table_generated.c.inc', + depends: [semantics_generated], + depend_files: [hex_common_py, attribs_def], + command: [python, files('gen_analyze_func_table.py'), semantics_generated, attribs_def, '@OUTPUT@'], +) +hexagon_ss.add(analyze_func_table_generated) + printinsn_generated = custom_target( 'printinsn_generated.h.inc', output: 'printinsn_generated.h.inc', From patchwork Thu Jan 5 22:13:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Simpson X-Patchwork-Id: 13090627 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B98C8C4708E for ; Thu, 5 Jan 2023 22:33:55 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pDYUd-00073W-Mu; Thu, 05 Jan 2023 17:13:47 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUX-0006wm-JU for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:41 -0500 Received: from mx0a-0031df01.pphosted.com ([205.220.168.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUU-00028D-Ai for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:41 -0500 Received: from pps.filterd (m0279865.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 305Ljij3019904; Thu, 5 Jan 2023 22:13:36 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=qcppdkim1; bh=iCyZPr5XDOuGuuy/oZUjLF+jxLJuj1e8UhmgVx+p050=; b=RvA+ecKZS47bBBnNL1V1LHeSTOO1vI+N56SH6pv6pIChCM9/vhlJNjZmfUV/PknvB9+z uaDualknAYrcUDQQlh8p+XpuGDHm/GkekaVQBnVZTrEvFLubxNdvy8TGsXxtl+F08J1h cNF9gJKmPmEfiqN6BHnlTrrznzl6irRK89A/hlGJmcFbnKYC4oas0muDmwft2YAsCjok 4esZWpEUDG1/jJwrCok+jXuIc7gn+l5/1sNpqxtTPxrkX01FzQN5LcqxX7gn/cY8bTBw 7GHd4WR7QkcuP7QVLcHCWvDE/rHMICsOeI+kELvA6fMW4nS4CdrZp/dDIRWbt2iZ8EwU +Q== Received: from nalasppmta01.qualcomm.com (Global_NAT1.qualcomm.com [129.46.96.20]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3mwwfs97xq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 05 Jan 2023 22:13:36 +0000 Received: from pps.filterd (NALASPPMTA01.qualcomm.com [127.0.0.1]) by NALASPPMTA01.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTP id 305MDZ9P003976; Thu, 5 Jan 2023 22:13:35 GMT Received: from pps.reinject (localhost [127.0.0.1]) by NALASPPMTA01.qualcomm.com (PPS) with ESMTP id 3mte5kxjvy-1; Thu, 05 Jan 2023 22:13:35 +0000 Received: from NALASPPMTA01.qualcomm.com (NALASPPMTA01.qualcomm.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 305MDZa0003961; Thu, 5 Jan 2023 22:13:35 GMT Received: from hu-devc-lv-u18-c.qualcomm.com (hu-tsimpson-lv.qualcomm.com [10.47.235.220]) by NALASPPMTA01.qualcomm.com (PPS) with ESMTP id 305MDYXa003953; Thu, 05 Jan 2023 22:13:35 +0000 Received: by hu-devc-lv-u18-c.qualcomm.com (Postfix, from userid 47164) id 4477F5000B9; Thu, 5 Jan 2023 14:13:34 -0800 (PST) From: Taylor Simpson To: qemu-devel@nongnu.org Cc: tsimpson@quicinc.com, richard.henderson@linaro.org, philmd@linaro.org, ale@rev.ng, anjo@rev.ng, bcain@quicinc.com, quic_mathbern@quicinc.com Subject: [PATCH v3 6/9] Hexagon (target/hexagon) Analyze packet for HVX Date: Thu, 5 Jan 2023 14:13:28 -0800 Message-Id: <20230105221331.12069-7-tsimpson@quicinc.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230105221331.12069-1-tsimpson@quicinc.com> References: <20230105221331.12069-1-tsimpson@quicinc.com> MIME-Version: 1.0 X-QCInternal: smtphost X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-GUID: 74Q4KwIBcaTarfLFRVE0KIveZ0fj7LX5 X-Proofpoint-ORIG-GUID: 74Q4KwIBcaTarfLFRVE0KIveZ0fj7LX5 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2023-01-05_12,2023-01-05_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 phishscore=0 impostorscore=0 clxscore=1015 priorityscore=1501 mlxlogscore=636 mlxscore=0 adultscore=0 bulkscore=0 malwarescore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301050174 Received-SPF: pass client-ip=205.220.168.131; envelope-from=tsimpson@qualcomm.com; helo=mx0a-0031df01.pphosted.com X-Spam_score_int: -24 X-Spam_score: -2.5 X-Spam_bar: -- X-Spam_report: (-2.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.25, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Taylor Simpson --- target/hexagon/translate.h | 14 ++++++++------ target/hexagon/translate.c | 30 +++++++++++++++++++++++++++++ target/hexagon/gen_analyze_funcs.py | 17 +++++++++++++--- target/hexagon/gen_tcg_funcs.py | 18 ----------------- 4 files changed, 52 insertions(+), 27 deletions(-) diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h index 7e864b417d..6f456517ba 100644 --- a/target/hexagon/translate.h +++ b/target/hexagon/translate.h @@ -54,6 +54,8 @@ typedef struct DisasContext { DECLARE_BITMAP(vregs_updated_tmp, NUM_VREGS); DECLARE_BITMAP(vregs_updated, NUM_VREGS); DECLARE_BITMAP(vregs_select, NUM_VREGS); + DECLARE_BITMAP(predicated_future_vregs, NUM_VREGS); + DECLARE_BITMAP(predicated_tmp_vregs, NUM_VREGS); int qreg_log[NUM_QREGS]; bool qreg_is_predicated[NUM_QREGS]; int qreg_log_idx; @@ -98,12 +100,6 @@ static inline void ctx_log_reg_write_pair(DisasContext *ctx, int rnum, ctx_log_reg_write(ctx, rnum + 1, is_predicated); } -static inline bool is_vreg_preloaded(DisasContext *ctx, int num) -{ - return test_bit(num, ctx->vregs_updated) || - test_bit(num, ctx->vregs_updated_tmp); -} - intptr_t ctx_future_vreg_off(DisasContext *ctx, int regnum, int num, bool alloc_ok); intptr_t ctx_tmp_vreg_off(DisasContext *ctx, int regnum, @@ -119,12 +115,18 @@ static inline void ctx_log_vreg_write(DisasContext *ctx, ctx->vreg_log_idx++; set_bit(rnum, ctx->vregs_updated); + if (is_predicated) { + set_bit(rnum, ctx->predicated_future_vregs); + } } if (type == EXT_NEW) { set_bit(rnum, ctx->vregs_select); } if (type == EXT_TMP) { set_bit(rnum, ctx->vregs_updated_tmp); + if (is_predicated) { + set_bit(rnum, ctx->predicated_tmp_vregs); + } } } diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c index 3e6e72d046..4a6b52fcc5 100644 --- a/target/hexagon/translate.c +++ b/target/hexagon/translate.c @@ -358,6 +358,8 @@ static void gen_start_packet(DisasContext *ctx) bitmap_zero(ctx->vregs_updated_tmp, NUM_VREGS); bitmap_zero(ctx->vregs_updated, NUM_VREGS); bitmap_zero(ctx->vregs_select, NUM_VREGS); + bitmap_zero(ctx->predicated_future_vregs, NUM_VREGS); + bitmap_zero(ctx->predicated_tmp_vregs, NUM_VREGS); ctx->qreg_log_idx = 0; for (i = 0; i < STORES_MAX; i++) { ctx->store_width[i] = 0; @@ -406,6 +408,34 @@ static void gen_start_packet(DisasContext *ctx) } } + /* Preload the HVX registers into future_VRegs and tmp_VRegs */ + if (!bitmap_empty(ctx->predicated_future_vregs, NUM_VREGS)) { + int i = find_first_bit(ctx->predicated_future_vregs, NUM_VREGS); + while (i < NUM_VREGS) { + const intptr_t VdV_off = + ctx_future_vreg_off(ctx, i, 1, true); + intptr_t src_off = offsetof(CPUHexagonState, VRegs[i]); + tcg_gen_gvec_mov(MO_64, VdV_off, + src_off, + sizeof(MMVector), + sizeof(MMVector)); + i = find_next_bit(ctx->predicated_future_vregs, NUM_VREGS, i + 1); + } + } + if (!bitmap_empty(ctx->predicated_tmp_vregs, NUM_VREGS)) { + int i = find_first_bit(ctx->predicated_tmp_vregs, NUM_VREGS); + while (i < NUM_VREGS) { + const intptr_t VdV_off = + ctx_tmp_vreg_off(ctx, i, 1, true); + intptr_t src_off = offsetof(CPUHexagonState, VRegs[i]); + tcg_gen_gvec_mov(MO_64, VdV_off, + src_off, + sizeof(MMVector), + sizeof(MMVector)); + i = find_next_bit(ctx->predicated_tmp_vregs, NUM_VREGS, i + 1); + } + } + if (pkt->pkt_has_hvx) { tcg_gen_movi_tl(hex_VRegs_updated, 0); tcg_gen_movi_tl(hex_QRegs_updated, 0); diff --git a/target/hexagon/gen_analyze_funcs.py b/target/hexagon/gen_analyze_funcs.py index 4358cc6ca3..42b3f228ea 100755 --- a/target/hexagon/gen_analyze_funcs.py +++ b/target/hexagon/gen_analyze_funcs.py @@ -83,9 +83,16 @@ def analyze_opn_old(f, tag, regtype, regid, regno): else: print("Bad register parse: ", regtype, regid) elif (regtype == "V"): + newv = "EXT_DFL" + if (hex_common.is_new_result(tag)): + newv = "EXT_NEW" + elif (hex_common.is_tmp_result(tag)): + newv = "EXT_TMP" if (regid in {"dd", "xx"}): - f.write("// const int %s = insn->regno[%d];\n" %\ + f.write(" const int %s = insn->regno[%d];\n" %\ (regN, regno)) + f.write(" ctx_log_vreg_write_pair(ctx, %s, %s, %s);\n" % \ + (regN, newv, predicated)) elif (regid in {"uu", "vv"}): f.write("// const int %s = insn->regno[%d];\n" % \ (regN, regno)) @@ -93,14 +100,18 @@ def analyze_opn_old(f, tag, regtype, regid, regno): f.write("// const int %s = insn->regno[%d];\n" % \ (regN, regno)) elif (regid in {"d", "x", "y"}): - f.write("// const int %s = insn->regno[%d];\n" % \ + f.write(" const int %s = insn->regno[%d];\n" % \ (regN, regno)) + f.write(" ctx_log_vreg_write(ctx, %s, %s, %s);\n" % \ + (regN, newv, predicated)) else: print("Bad register parse: ", regtype, regid) elif (regtype == "Q"): if (regid in {"d", "e", "x"}): - f.write("// const int %s = insn->regno[%d];\n" % \ + f.write(" const int %s = insn->regno[%d];\n" % \ (regN, regno)) + f.write(" ctx_log_qreg_write(ctx, %s, %s);\n" % \ + (regN, predicated)) elif (regid in {"s", "t", "u", "v"}): f.write("// const int %s = insn->regno[%d];\n" % \ (regN, regno)) diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py index 5d686ebcf1..3786ede44c 100755 --- a/target/hexagon/gen_tcg_funcs.py +++ b/target/hexagon/gen_tcg_funcs.py @@ -159,17 +159,6 @@ def genptr_decl(f, tag, regtype, regid, regno): f.write(" ctx_future_vreg_off(ctx, %s%sN," % \ (regtype, regid)) f.write(" 1, true);\n"); - if 'A_CONDEXEC' in hex_common.attribdict[tag]: - f.write(" if (!is_vreg_preloaded(ctx, %s)) {\n" % (regN)) - f.write(" intptr_t src_off =") - f.write(" offsetof(CPUHexagonState, VRegs[%s%sN]);\n"% \ - (regtype, regid)) - f.write(" tcg_gen_gvec_mov(MO_64, %s%sV_off,\n" % \ - (regtype, regid)) - f.write(" src_off,\n") - f.write(" sizeof(MMVector),\n") - f.write(" sizeof(MMVector));\n") - f.write(" }\n") if (not hex_common.skip_qemu_helper(tag)): f.write(" TCGv_ptr %s%sV = tcg_temp_new_ptr();\n" % \ @@ -495,9 +484,6 @@ def genptr_dst_write_ext(f, tag, regtype, regid, newv="EXT_DFL"): (regtype, regid, regtype, regid)) f.write("%s, insn->slot, %s);\n" % \ (newv, is_predicated)) - f.write(" ctx_log_vreg_write_pair(ctx, %s%sN, %s,\n" % \ - (regtype, regid, newv)) - f.write(" %s);\n" % (is_predicated)) elif (regid in {"d", "x", "y"}): if ('A_CONDEXEC' in hex_common.attribdict[tag]): is_predicated = "true" @@ -507,8 +493,6 @@ def genptr_dst_write_ext(f, tag, regtype, regid, newv="EXT_DFL"): (regtype, regid, regtype, regid, newv)) f.write("insn->slot, %s);\n" % \ (is_predicated)) - f.write(" ctx_log_vreg_write(ctx, %s%sN, %s, %s);\n" % \ - (regtype, regid, newv, is_predicated)) else: print("Bad register parse: ", regtype, regid) elif (regtype == "Q"): @@ -520,8 +504,6 @@ def genptr_dst_write_ext(f, tag, regtype, regid, newv="EXT_DFL"): f.write(" gen_log_qreg_write(%s%sV_off, %s%sN, %s, " % \ (regtype, regid, regtype, regid, newv)) f.write("insn->slot, %s);\n" % (is_predicated)) - f.write(" ctx_log_qreg_write(ctx, %s%sN, %s);\n" % \ - (regtype, regid, is_predicated)) else: print("Bad register parse: ", regtype, regid) else: From patchwork Thu Jan 5 22:13:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Simpson X-Patchwork-Id: 13090594 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 21234C4708E for ; Thu, 5 Jan 2023 22:13:57 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pDYUb-000714-UM; Thu, 05 Jan 2023 17:13:45 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUY-0006xi-O2 for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:42 -0500 Received: from mx0a-0031df01.pphosted.com ([205.220.168.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUU-00028F-JL for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:42 -0500 Received: from pps.filterd (m0279863.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 305M2GYG009553; Thu, 5 Jan 2023 22:13:36 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=qcppdkim1; bh=D5dOBpBJxUrzbTzscFNevYsa/fPS6s0arse/NPqbhKs=; b=jO50xnIBc9FWvYlGJ8rJEkfdd0ACgEl9xMT843CadLVaLuqTqsI5oYSwnquVoE0AJRF5 aDi3EsoVLqWA7k+CcOIfzqItzvV/FQMsDS0UHJJ9cV5IIyBrTnhrhxp+UC2yO1ujM9hN q0dmJNemqDKoHnlhbTlel0/qSIV0h82Fuq7yaaI2JZnPzPljE/klOrzfAos9VpCQfb+L dgXBa4vYXkhXxft0M7VKo8ZSjEGytowPl9FZlq2b4l3pdFoK9LU0HKja3zTgUEjc+GwM DAQlNJiohPYh4gXEXo5eN/cYplXGLhF0YbH0sBtcm8zXVqA4duywq5O4z7UuphibZQyl 8Q== Received: from nalasppmta01.qualcomm.com (Global_NAT1.qualcomm.com [129.46.96.20]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3mwpwnt16r-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 05 Jan 2023 22:13:36 +0000 Received: from pps.filterd (NALASPPMTA01.qualcomm.com [127.0.0.1]) by NALASPPMTA01.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTP id 305MDZhF003977; Thu, 5 Jan 2023 22:13:35 GMT Received: from pps.reinject (localhost [127.0.0.1]) by NALASPPMTA01.qualcomm.com (PPS) with ESMTP id 3mte5kxjw3-1; Thu, 05 Jan 2023 22:13:35 +0000 Received: from NALASPPMTA01.qualcomm.com (NALASPPMTA01.qualcomm.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 305MDZkS003969; Thu, 5 Jan 2023 22:13:35 GMT Received: from hu-devc-lv-u18-c.qualcomm.com (hu-tsimpson-lv.qualcomm.com [10.47.235.220]) by NALASPPMTA01.qualcomm.com (PPS) with ESMTP id 305MDYir003957; Thu, 05 Jan 2023 22:13:35 +0000 Received: by hu-devc-lv-u18-c.qualcomm.com (Postfix, from userid 47164) id 47AE9500102; Thu, 5 Jan 2023 14:13:34 -0800 (PST) From: Taylor Simpson To: qemu-devel@nongnu.org Cc: tsimpson@quicinc.com, richard.henderson@linaro.org, philmd@linaro.org, ale@rev.ng, anjo@rev.ng, bcain@quicinc.com, quic_mathbern@quicinc.com Subject: [PATCH v3 7/9] Hexagon (tests/tcg/hexagon) Update preg_alias.c Date: Thu, 5 Jan 2023 14:13:29 -0800 Message-Id: <20230105221331.12069-8-tsimpson@quicinc.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230105221331.12069-1-tsimpson@quicinc.com> References: <20230105221331.12069-1-tsimpson@quicinc.com> MIME-Version: 1.0 X-QCInternal: smtphost X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-ORIG-GUID: 1X2NJkELaI8tCoaw9i9DKlZmkGW9SJS- X-Proofpoint-GUID: 1X2NJkELaI8tCoaw9i9DKlZmkGW9SJS- X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2023-01-05_12,2023-01-05_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxscore=0 mlxlogscore=707 suspectscore=0 adultscore=0 lowpriorityscore=0 phishscore=0 spamscore=0 malwarescore=0 priorityscore=1501 bulkscore=0 impostorscore=0 clxscore=1015 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301050174 Received-SPF: pass client-ip=205.220.168.131; envelope-from=tsimpson@qualcomm.com; helo=mx0a-0031df01.pphosted.com X-Spam_score_int: -24 X-Spam_score: -2.5 X-Spam_bar: -- X-Spam_report: (-2.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.25, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Add control registers (c4, c5) to clobbers list Made possible by new toolchain container Signed-off-by: Taylor Simpson --- tests/tcg/hexagon/preg_alias.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/tests/tcg/hexagon/preg_alias.c b/tests/tcg/hexagon/preg_alias.c index b44a8112b4..8798fbcaf3 100644 --- a/tests/tcg/hexagon/preg_alias.c +++ b/tests/tcg/hexagon/preg_alias.c @@ -1,5 +1,5 @@ /* - * Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved. + * Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -65,7 +65,7 @@ static inline void creg_alias(int cval, PRegs *pregs) : "=r"(pregs->pregs.p0), "=r"(pregs->pregs.p1), "=r"(pregs->pregs.p2), "=r"(pregs->pregs.p3) : "r"(cval) - : "p0", "p1", "p2", "p3"); + : "c4", "p0", "p1", "p2", "p3"); } int err; @@ -92,7 +92,7 @@ static inline void creg_alias_pair(unsigned int cval, PRegs *pregs) : "=r"(pregs->pregs.p0), "=r"(pregs->pregs.p1), "=r"(pregs->pregs.p2), "=r"(pregs->pregs.p3), "=r"(c5) : "r"(cval_pair) - : "p0", "p1", "p2", "p3"); + : "c4", "c5", "p0", "p1", "p2", "p3"); check(c5, 0xdeadbeef); } @@ -117,7 +117,7 @@ static void test_packet(void) "}\n\t" : "+r"(result) : "r"(0xffffffff), "r"(0xff00ffff), "r"(0x837ed653) - : "p0", "p1", "p2", "p3"); + : "c4", "p0", "p1", "p2", "p3"); check(result, old_val); /* Test a predicated store */ @@ -129,7 +129,7 @@ static void test_packet(void) "}\n\t" : : "r"(0), "r"(0xffffffff), "r"(&result) - : "p0", "p1", "p2", "p3", "memory"); + : "c4", "p0", "p1", "p2", "p3", "memory"); check(result, 0x0); } From patchwork Thu Jan 5 22:13:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Simpson X-Patchwork-Id: 13090628 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D1EA1C54EBC for ; Thu, 5 Jan 2023 22:33:57 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pDYUa-0006zi-NI; Thu, 05 Jan 2023 17:13:44 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUX-0006we-Qf for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:41 -0500 Received: from mx0b-0031df01.pphosted.com ([205.220.180.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUU-00028A-5o for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:41 -0500 Received: from pps.filterd (m0279872.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 305Khcka007382; Thu, 5 Jan 2023 22:13:36 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=qcppdkim1; bh=Ra8/1489eqdU17G/MSzyM/vJBeqSYDlliaZQl5rkcms=; b=cIi82R3nF2pHkA18EQsYvdL4IQm47JwxniLI+DAg8eR6JpBEclxrjbVkpEjcAZ8gUBdf u0Gt9gDw+UiN+ZTnQUSqRpEi8eyxrByQaWbw0vA/DuEgcYXzZ2i2SVRYBkiWSH+KpVhO D4mE4tjYSRmCw7kexacYTIM37MJuTV2Ft7Lioxjr8Nh4cTRgzg1PKUJlBDXQ5bY+36sT Jv/px2tT2zqeGVX6I0vcDyMd8YC5mERPJW7kxBf+QnLjlB5qjryMwGazLKxFMqOfPjVd ZQwg5bdQ7yqU1I3H5Y8urhjUc7q4IHK+41QI5vgsmtZqRPI5P7Ka78LfK68dTMexvktQ Ig== Received: from nalasppmta03.qualcomm.com (Global_NAT1.qualcomm.com [129.46.96.20]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3mwvaphc4j-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 05 Jan 2023 22:13:36 +0000 Received: from pps.filterd (NALASPPMTA03.qualcomm.com [127.0.0.1]) by NALASPPMTA03.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTP id 305MDZaM009374; Thu, 5 Jan 2023 22:13:35 GMT Received: from pps.reinject (localhost [127.0.0.1]) by NALASPPMTA03.qualcomm.com (PPS) with ESMTP id 3mx6k603b9-1; Thu, 05 Jan 2023 22:13:35 +0000 Received: from NALASPPMTA03.qualcomm.com (NALASPPMTA03.qualcomm.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 305MDZC2009369; Thu, 5 Jan 2023 22:13:35 GMT Received: from hu-devc-lv-u18-c.qualcomm.com (hu-tsimpson-lv.qualcomm.com [10.47.235.220]) by NALASPPMTA03.qualcomm.com (PPS) with ESMTP id 305MDYqv009364; Thu, 05 Jan 2023 22:13:35 +0000 Received: by hu-devc-lv-u18-c.qualcomm.com (Postfix, from userid 47164) id 4AC7F500105; Thu, 5 Jan 2023 14:13:34 -0800 (PST) From: Taylor Simpson To: qemu-devel@nongnu.org Cc: tsimpson@quicinc.com, richard.henderson@linaro.org, philmd@linaro.org, ale@rev.ng, anjo@rev.ng, bcain@quicinc.com, quic_mathbern@quicinc.com Subject: [PATCH v3 8/9] Hexagon (tests/tcg/hexagon) Remove __builtin from scatter_gather Date: Thu, 5 Jan 2023 14:13:30 -0800 Message-Id: <20230105221331.12069-9-tsimpson@quicinc.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230105221331.12069-1-tsimpson@quicinc.com> References: <20230105221331.12069-1-tsimpson@quicinc.com> MIME-Version: 1.0 X-QCInternal: smtphost X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-GUID: hDAQHVtB9VRC-qIseg_tzpJ9kYXXnmEK X-Proofpoint-ORIG-GUID: hDAQHVtB9VRC-qIseg_tzpJ9kYXXnmEK X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2023-01-05_12,2023-01-05_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 bulkscore=0 priorityscore=1501 suspectscore=0 clxscore=1015 mlxscore=0 impostorscore=0 phishscore=0 spamscore=0 mlxlogscore=978 adultscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301050174 Received-SPF: pass client-ip=205.220.180.131; envelope-from=tsimpson@qualcomm.com; helo=mx0b-0031df01.pphosted.com X-Spam_score_int: -17 X-Spam_score: -1.8 X-Spam_bar: - X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.25, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Replace __builtin_* with inline assembly The __builtin's are subject to change with different compiler releases, so might break Mark arrays as aligned when accessed as HVX vectors Clean up comments Signed-off-by: Taylor Simpson --- tests/tcg/hexagon/scatter_gather.c | 513 +++++++++++++++-------------- 1 file changed, 271 insertions(+), 242 deletions(-) diff --git a/tests/tcg/hexagon/scatter_gather.c b/tests/tcg/hexagon/scatter_gather.c index b93eb18133..bf8b5e0317 100644 --- a/tests/tcg/hexagon/scatter_gather.c +++ b/tests/tcg/hexagon/scatter_gather.c @@ -1,5 +1,5 @@ /* - * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved. + * Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -40,47 +40,6 @@ typedef long HVX_VectorPair __attribute__((__vector_size__(256))) typedef long HVX_VectorPred __attribute__((__vector_size__(128))) __attribute__((aligned(128))); -#define VSCATTER_16(BASE, RGN, OFF, VALS) \ - __builtin_HEXAGON_V6_vscattermh_128B((int)BASE, RGN, OFF, VALS) -#define VSCATTER_16_MASKED(MASK, BASE, RGN, OFF, VALS) \ - __builtin_HEXAGON_V6_vscattermhq_128B(MASK, (int)BASE, RGN, OFF, VALS) -#define VSCATTER_32(BASE, RGN, OFF, VALS) \ - __builtin_HEXAGON_V6_vscattermw_128B((int)BASE, RGN, OFF, VALS) -#define VSCATTER_32_MASKED(MASK, BASE, RGN, OFF, VALS) \ - __builtin_HEXAGON_V6_vscattermwq_128B(MASK, (int)BASE, RGN, OFF, VALS) -#define VSCATTER_16_32(BASE, RGN, OFF, VALS) \ - __builtin_HEXAGON_V6_vscattermhw_128B((int)BASE, RGN, OFF, VALS) -#define VSCATTER_16_32_MASKED(MASK, BASE, RGN, OFF, VALS) \ - __builtin_HEXAGON_V6_vscattermhwq_128B(MASK, (int)BASE, RGN, OFF, VALS) -#define VSCATTER_16_ACC(BASE, RGN, OFF, VALS) \ - __builtin_HEXAGON_V6_vscattermh_add_128B((int)BASE, RGN, OFF, VALS) -#define VSCATTER_32_ACC(BASE, RGN, OFF, VALS) \ - __builtin_HEXAGON_V6_vscattermw_add_128B((int)BASE, RGN, OFF, VALS) -#define VSCATTER_16_32_ACC(BASE, RGN, OFF, VALS) \ - __builtin_HEXAGON_V6_vscattermhw_add_128B((int)BASE, RGN, OFF, VALS) - -#define VGATHER_16(DSTADDR, BASE, RGN, OFF) \ - __builtin_HEXAGON_V6_vgathermh_128B(DSTADDR, (int)BASE, RGN, OFF) -#define VGATHER_16_MASKED(DSTADDR, MASK, BASE, RGN, OFF) \ - __builtin_HEXAGON_V6_vgathermhq_128B(DSTADDR, MASK, (int)BASE, RGN, OFF) -#define VGATHER_32(DSTADDR, BASE, RGN, OFF) \ - __builtin_HEXAGON_V6_vgathermw_128B(DSTADDR, (int)BASE, RGN, OFF) -#define VGATHER_32_MASKED(DSTADDR, MASK, BASE, RGN, OFF) \ - __builtin_HEXAGON_V6_vgathermwq_128B(DSTADDR, MASK, (int)BASE, RGN, OFF) -#define VGATHER_16_32(DSTADDR, BASE, RGN, OFF) \ - __builtin_HEXAGON_V6_vgathermhw_128B(DSTADDR, (int)BASE, RGN, OFF) -#define VGATHER_16_32_MASKED(DSTADDR, MASK, BASE, RGN, OFF) \ - __builtin_HEXAGON_V6_vgathermhwq_128B(DSTADDR, MASK, (int)BASE, RGN, OFF) - -#define VSHUFF_H(V) \ - __builtin_HEXAGON_V6_vshuffh_128B(V) -#define VSPLAT_H(X) \ - __builtin_HEXAGON_V6_lvsplath_128B(X) -#define VAND_VAL(PRED, VAL) \ - __builtin_HEXAGON_V6_vandvrt_128B(PRED, VAL) -#define VDEAL_H(V) \ - __builtin_HEXAGON_V6_vdealh_128B(V) - int err; /* define the number of rows/cols in a square matrix */ @@ -108,22 +67,22 @@ unsigned short vscatter16_32_ref[SCATTER_BUFFER_SIZE]; unsigned short vgather16_32_ref[MATRIX_SIZE]; /* declare the arrays of offsets */ -unsigned short half_offsets[MATRIX_SIZE]; -unsigned int word_offsets[MATRIX_SIZE]; +unsigned short half_offsets[MATRIX_SIZE] __attribute__((aligned(128))); +unsigned int word_offsets[MATRIX_SIZE] __attribute__((aligned(128))); /* declare the arrays of values */ -unsigned short half_values[MATRIX_SIZE]; -unsigned short half_values_acc[MATRIX_SIZE]; -unsigned short half_values_masked[MATRIX_SIZE]; -unsigned int word_values[MATRIX_SIZE]; -unsigned int word_values_acc[MATRIX_SIZE]; -unsigned int word_values_masked[MATRIX_SIZE]; +unsigned short half_values[MATRIX_SIZE] __attribute__((aligned(128))); +unsigned short half_values_acc[MATRIX_SIZE] __attribute__((aligned(128))); +unsigned short half_values_masked[MATRIX_SIZE] __attribute__((aligned(128))); +unsigned int word_values[MATRIX_SIZE] __attribute__((aligned(128))); +unsigned int word_values_acc[MATRIX_SIZE] __attribute__((aligned(128))); +unsigned int word_values_masked[MATRIX_SIZE] __attribute__((aligned(128))); /* declare the arrays of predicates */ -unsigned short half_predicates[MATRIX_SIZE]; -unsigned int word_predicates[MATRIX_SIZE]; +unsigned short half_predicates[MATRIX_SIZE] __attribute__((aligned(128))); +unsigned int word_predicates[MATRIX_SIZE] __attribute__((aligned(128))); -/* make this big enough for all the intrinsics */ +/* make this big enough for all the operations */ const size_t region_len = sizeof(vtcm); /* optionally add sync instructions */ @@ -261,164 +220,201 @@ void create_offsets_values_preds_16_32(void) } } -/* scatter the 16 bit elements using intrinsics */ +/* scatter the 16 bit elements using HVX */ void vector_scatter_16(void) { - /* copy the offsets and values to vectors */ - HVX_Vector offsets = *(HVX_Vector *)half_offsets; - HVX_Vector values = *(HVX_Vector *)half_values; - - VSCATTER_16(&vtcm.vscatter16, region_len, offsets, values); + asm ("m0 = %1\n\t" + "v0 = vmem(%2 + #0)\n\t" + "v1 = vmem(%3 + #0)\n\t" + "vscatter(%0, m0, v0.h).h = v1\n\t" + : : "r"(vtcm.vscatter16), "r"(region_len), + "r"(half_offsets), "r"(half_values) + : "m0", "v0", "v1", "memory"); sync_scatter(vtcm.vscatter16); } -/* scatter-accumulate the 16 bit elements using intrinsics */ +/* scatter-accumulate the 16 bit elements using HVX */ void vector_scatter_16_acc(void) { - /* copy the offsets and values to vectors */ - HVX_Vector offsets = *(HVX_Vector *)half_offsets; - HVX_Vector values = *(HVX_Vector *)half_values_acc; - - VSCATTER_16_ACC(&vtcm.vscatter16, region_len, offsets, values); + asm ("m0 = %1\n\t" + "v0 = vmem(%2 + #0)\n\t" + "v1 = vmem(%3 + #0)\n\t" + "vscatter(%0, m0, v0.h).h += v1\n\t" + : : "r"(vtcm.vscatter16), "r"(region_len), + "r"(half_offsets), "r"(half_values_acc) + : "m0", "v0", "v1", "memory"); sync_scatter(vtcm.vscatter16); } -/* scatter the 16 bit elements using intrinsics */ +/* masked scatter the 16 bit elements using HVX */ void vector_scatter_16_masked(void) { - /* copy the offsets and values to vectors */ - HVX_Vector offsets = *(HVX_Vector *)half_offsets; - HVX_Vector values = *(HVX_Vector *)half_values_masked; - HVX_Vector pred_reg = *(HVX_Vector *)half_predicates; - HVX_VectorPred preds = VAND_VAL(pred_reg, ~0); - - VSCATTER_16_MASKED(preds, &vtcm.vscatter16, region_len, offsets, values); + asm ("r1 = #-1\n\t" + "v0 = vmem(%0 + #0)\n\t" + "q0 = vand(v0, r1)\n\t" + "m0 = %2\n\t" + "v0 = vmem(%3 + #0)\n\t" + "v1 = vmem(%4 + #0)\n\t" + "if (q0) vscatter(%1, m0, v0.h).h = v1\n\t" + : : "r"(half_predicates), "r"(vtcm.vscatter16), "r"(region_len), + "r"(half_offsets), "r"(half_values_masked) + : "r1", "q0", "m0", "q0", "v0", "v1", "memory"); sync_scatter(vtcm.vscatter16); } -/* scatter the 32 bit elements using intrinsics */ +/* scatter the 32 bit elements using HVX */ void vector_scatter_32(void) { - /* copy the offsets and values to vectors */ - HVX_Vector offsetslo = *(HVX_Vector *)word_offsets; - HVX_Vector offsetshi = *(HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; - HVX_Vector valueslo = *(HVX_Vector *)word_values; - HVX_Vector valueshi = *(HVX_Vector *)&word_values[MATRIX_SIZE / 2]; - - VSCATTER_32(&vtcm.vscatter32, region_len, offsetslo, valueslo); - VSCATTER_32(&vtcm.vscatter32, region_len, offsetshi, valueshi); + HVX_Vector *offsetslo = (HVX_Vector *)word_offsets; + HVX_Vector *offsetshi = (HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; + HVX_Vector *valueslo = (HVX_Vector *)word_values; + HVX_Vector *valueshi = (HVX_Vector *)&word_values[MATRIX_SIZE / 2]; + + asm ("m0 = %1\n\t" + "v0 = vmem(%2 + #0)\n\t" + "v1 = vmem(%3 + #0)\n\t" + "vscatter(%0, m0, v0.w).w = v1\n\t" + : : "r"(vtcm.vscatter32), "r"(region_len), + "r"(offsetslo), "r"(valueslo) + : "m0", "v0", "v1", "memory"); + asm ("m0 = %1\n\t" + "v0 = vmem(%2 + #0)\n\t" + "v1 = vmem(%3 + #0)\n\t" + "vscatter(%0, m0, v0.w).w = v1\n\t" + : : "r"(vtcm.vscatter32), "r"(region_len), + "r"(offsetshi), "r"(valueshi) + : "m0", "v0", "v1", "memory"); sync_scatter(vtcm.vscatter32); } -/* scatter-acc the 32 bit elements using intrinsics */ +/* scatter-accumulate the 32 bit elements using HVX */ void vector_scatter_32_acc(void) { - /* copy the offsets and values to vectors */ - HVX_Vector offsetslo = *(HVX_Vector *)word_offsets; - HVX_Vector offsetshi = *(HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; - HVX_Vector valueslo = *(HVX_Vector *)word_values_acc; - HVX_Vector valueshi = *(HVX_Vector *)&word_values_acc[MATRIX_SIZE / 2]; - - VSCATTER_32_ACC(&vtcm.vscatter32, region_len, offsetslo, valueslo); - VSCATTER_32_ACC(&vtcm.vscatter32, region_len, offsetshi, valueshi); + HVX_Vector *offsetslo = (HVX_Vector *)word_offsets; + HVX_Vector *offsetshi = (HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; + HVX_Vector *valueslo = (HVX_Vector *)word_values_acc; + HVX_Vector *valueshi = (HVX_Vector *)&word_values_acc[MATRIX_SIZE / 2]; + + asm ("m0 = %1\n\t" + "v0 = vmem(%2 + #0)\n\t" + "v1 = vmem(%3 + #0)\n\t" + "vscatter(%0, m0, v0.w).w += v1\n\t" + : : "r"(vtcm.vscatter32), "r"(region_len), + "r"(offsetslo), "r"(valueslo) + : "m0", "v0", "v1", "memory"); + asm ("m0 = %1\n\t" + "v0 = vmem(%2 + #0)\n\t" + "v1 = vmem(%3 + #0)\n\t" + "vscatter(%0, m0, v0.w).w += v1\n\t" + : : "r"(vtcm.vscatter32), "r"(region_len), + "r"(offsetshi), "r"(valueshi) + : "m0", "v0", "v1", "memory"); sync_scatter(vtcm.vscatter32); } -/* scatter the 32 bit elements using intrinsics */ +/* masked scatter the 32 bit elements using HVX */ void vector_scatter_32_masked(void) { - /* copy the offsets and values to vectors */ - HVX_Vector offsetslo = *(HVX_Vector *)word_offsets; - HVX_Vector offsetshi = *(HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; - HVX_Vector valueslo = *(HVX_Vector *)word_values_masked; - HVX_Vector valueshi = *(HVX_Vector *)&word_values_masked[MATRIX_SIZE / 2]; - HVX_Vector pred_reglo = *(HVX_Vector *)word_predicates; - HVX_Vector pred_reghi = *(HVX_Vector *)&word_predicates[MATRIX_SIZE / 2]; - HVX_VectorPred predslo = VAND_VAL(pred_reglo, ~0); - HVX_VectorPred predshi = VAND_VAL(pred_reghi, ~0); - - VSCATTER_32_MASKED(predslo, &vtcm.vscatter32, region_len, offsetslo, - valueslo); - VSCATTER_32_MASKED(predshi, &vtcm.vscatter32, region_len, offsetshi, - valueshi); + HVX_Vector *offsetslo = (HVX_Vector *)word_offsets; + HVX_Vector *offsetshi = (HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; + HVX_Vector *valueslo = (HVX_Vector *)word_values_masked; + HVX_Vector *valueshi = (HVX_Vector *)&word_values_masked[MATRIX_SIZE / 2]; + HVX_Vector *predslo = (HVX_Vector *)word_predicates; + HVX_Vector *predshi = (HVX_Vector *)&word_predicates[MATRIX_SIZE / 2]; + + asm ("r1 = #-1\n\t" + "v0 = vmem(%0 + #0)\n\t" + "q0 = vand(v0, r1)\n\t" + "m0 = %2\n\t" + "v0 = vmem(%3 + #0)\n\t" + "v1 = vmem(%4 + #0)\n\t" + "if (q0) vscatter(%1, m0, v0.w).w = v1\n\t" + : : "r"(predslo), "r"(vtcm.vscatter32), "r"(region_len), + "r"(offsetslo), "r"(valueslo) + : "r1", "q0", "m0", "q0", "v0", "v1", "memory"); + asm ("r1 = #-1\n\t" + "v0 = vmem(%0 + #0)\n\t" + "q0 = vand(v0, r1)\n\t" + "m0 = %2\n\t" + "v0 = vmem(%3 + #0)\n\t" + "v1 = vmem(%4 + #0)\n\t" + "if (q0) vscatter(%1, m0, v0.w).w = v1\n\t" + : : "r"(predshi), "r"(vtcm.vscatter32), "r"(region_len), + "r"(offsetshi), "r"(valueshi) + : "r1", "q0", "m0", "q0", "v0", "v1", "memory"); - sync_scatter(vtcm.vscatter16); + sync_scatter(vtcm.vscatter32); } -/* scatter the 16 bit elements with 32 bit offsets using intrinsics */ +/* scatter the 16 bit elements with 32 bit offsets using HVX */ void vector_scatter_16_32(void) { - HVX_VectorPair offsets; - HVX_Vector values; - - /* get the word offsets in a vector pair */ - offsets = *(HVX_VectorPair *)word_offsets; - - /* these values need to be shuffled for the scatter */ - values = *(HVX_Vector *)half_values; - values = VSHUFF_H(values); - - VSCATTER_16_32(&vtcm.vscatter16_32, region_len, offsets, values); + asm ("m0 = %1\n\t" + "v0 = vmem(%2 + #0)\n\t" + "v1 = vmem(%2 + #1)\n\t" + "v2 = vmem(%3 + #0)\n\t" + "v2.h = vshuff(v2.h)\n\t" /* shuffle the values for the scatter */ + "vscatter(%0, m0, v1:0.w).h = v2\n\t" + : : "r"(vtcm.vscatter16_32), "r"(region_len), + "r"(word_offsets), "r"(half_values) + : "m0", "v0", "v1", "v2", "memory"); sync_scatter(vtcm.vscatter16_32); } -/* scatter-acc the 16 bit elements with 32 bit offsets using intrinsics */ +/* scatter-accumulate the 16 bit elements with 32 bit offsets using HVX */ void vector_scatter_16_32_acc(void) { - HVX_VectorPair offsets; - HVX_Vector values; - - /* get the word offsets in a vector pair */ - offsets = *(HVX_VectorPair *)word_offsets; - - /* these values need to be shuffled for the scatter */ - values = *(HVX_Vector *)half_values_acc; - values = VSHUFF_H(values); - - VSCATTER_16_32_ACC(&vtcm.vscatter16_32, region_len, offsets, values); + asm ("m0 = %1\n\t" + "v0 = vmem(%2 + #0)\n\t" + "v1 = vmem(%2 + #1)\n\t" + "v2 = vmem(%3 + #0)\n\t" \ + "v2.h = vshuff(v2.h)\n\t" /* shuffle the values for the scatter */ + "vscatter(%0, m0, v1:0.w).h += v2\n\t" + : : "r"(vtcm.vscatter16_32), "r"(region_len), + "r"(word_offsets), "r"(half_values_acc) + : "m0", "v0", "v1", "v2", "memory"); sync_scatter(vtcm.vscatter16_32); } -/* masked scatter the 16 bit elements with 32 bit offsets using intrinsics */ +/* masked scatter the 16 bit elements with 32 bit offsets using HVX */ void vector_scatter_16_32_masked(void) { - HVX_VectorPair offsets; - HVX_Vector values; - HVX_Vector pred_reg; - - /* get the word offsets in a vector pair */ - offsets = *(HVX_VectorPair *)word_offsets; - - /* these values need to be shuffled for the scatter */ - values = *(HVX_Vector *)half_values_masked; - values = VSHUFF_H(values); - - pred_reg = *(HVX_Vector *)half_predicates; - pred_reg = VSHUFF_H(pred_reg); - HVX_VectorPred preds = VAND_VAL(pred_reg, ~0); - - VSCATTER_16_32_MASKED(preds, &vtcm.vscatter16_32, region_len, offsets, - values); + asm ("r1 = #-1\n\t" + "v0 = vmem(%0 + #0)\n\t" + "v0.h = vshuff(v0.h)\n\t" /* shuffle the predicates */ + "q0 = vand(v0, r1)\n\t" + "m0 = %2\n\t" + "v0 = vmem(%3 + #0)\n\t" + "v1 = vmem(%3 + #1)\n\t" + "v2 = vmem(%4 + #0)\n\t" \ + "v2.h = vshuff(v2.h)\n\t" /* shuffle the values for the scatter */ + "if (q0) vscatter(%1, m0, v1:0.w).h = v2\n\t" + : : "r"(half_predicates), "r"(vtcm.vscatter16_32), "r"(region_len), + "r"(word_offsets), "r"(half_values_masked) + : "r1", "q0", "m0", "v0", "v1", "v2", "memory"); sync_scatter(vtcm.vscatter16_32); } -/* gather the elements from the scatter16 buffer */ +/* gather the elements from the scatter16 buffer using HVX */ void vector_gather_16(void) { - HVX_Vector *vgather = (HVX_Vector *)&vtcm.vgather16; - HVX_Vector offsets = *(HVX_Vector *)half_offsets; - - VGATHER_16(vgather, &vtcm.vscatter16, region_len, offsets); - - sync_gather(vgather); + asm ("m0 = %1\n\t" + "v0 = vmem(%2 + #0)\n\t" + "{ vtmp.h = vgather(%0, m0, v0.h).h\n\t" + " vmem(%3 + #0) = vtmp.new }\n\t" + : : "r"(vtcm.vscatter16), "r"(region_len), + "r"(half_offsets), "r"(vtcm.vgather16) + : "m0", "v0", "memory"); + + sync_gather(vtcm.vgather16); } static unsigned short gather_16_masked_init(void) @@ -427,31 +423,51 @@ static unsigned short gather_16_masked_init(void) return letter | (letter << 8); } +/* masked gather the elements from the scatter16 buffer using HVX */ void vector_gather_16_masked(void) { - HVX_Vector *vgather = (HVX_Vector *)&vtcm.vgather16; - HVX_Vector offsets = *(HVX_Vector *)half_offsets; - HVX_Vector pred_reg = *(HVX_Vector *)half_predicates; - HVX_VectorPred preds = VAND_VAL(pred_reg, ~0); - - *vgather = VSPLAT_H(gather_16_masked_init()); - VGATHER_16_MASKED(vgather, preds, &vtcm.vscatter16, region_len, offsets); - - sync_gather(vgather); + unsigned short init = gather_16_masked_init(); + + asm ("v0.h = vsplat(%5)\n\t" + "vmem(%4 + #0) = v0\n\t" /* initialize the write area */ + "r1 = #-1\n\t" + "v0 = vmem(%0 + #0)\n\t" + "q0 = vand(v0, r1)\n\t" + "m0 = %2\n\t" + "v0 = vmem(%3 + #0)\n\t" + "{ if (q0) vtmp.h = vgather(%1, m0, v0.h).h\n\t" + " vmem(%4 + #0) = vtmp.new }\n\t" + : : "r"(half_predicates), "r"(vtcm.vscatter16), "r"(region_len), + "r"(half_offsets), "r"(vtcm.vgather16), "r"(init) + : "r1", "q0", "m0", "v0", "memory"); + + sync_gather(vtcm.vgather16); } -/* gather the elements from the scatter32 buffer */ +/* gather the elements from the scatter32 buffer using HVX */ void vector_gather_32(void) { - HVX_Vector *vgatherlo = (HVX_Vector *)&vtcm.vgather32; - HVX_Vector *vgatherhi = - (HVX_Vector *)((int)&vtcm.vgather32 + (MATRIX_SIZE * 2)); - HVX_Vector offsetslo = *(HVX_Vector *)word_offsets; - HVX_Vector offsetshi = *(HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; - - VGATHER_32(vgatherlo, &vtcm.vscatter32, region_len, offsetslo); - VGATHER_32(vgatherhi, &vtcm.vscatter32, region_len, offsetshi); + HVX_Vector *vgatherlo = (HVX_Vector *)vtcm.vgather32; + HVX_Vector *vgatherhi = (HVX_Vector *)&vtcm.vgather32[MATRIX_SIZE / 2]; + HVX_Vector *offsetslo = (HVX_Vector *)word_offsets; + HVX_Vector *offsetshi = (HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; + + asm ("m0 = %1\n\t" + "v0 = vmem(%2 + #0)\n\t" + "{ vtmp.w = vgather(%0, m0, v0.w).w\n\t" + " vmem(%3 + #0) = vtmp.new }\n\t" + : : "r"(vtcm.vscatter32), "r"(region_len), + "r"(offsetslo), "r"(vgatherlo) + : "m0", "v0", "memory"); + asm ("m0 = %1\n\t" + "v0 = vmem(%2 + #0)\n\t" + "{ vtmp.w = vgather(%0, m0, v0.w).w\n\t" + " vmem(%3 + #0) = vtmp.new }\n\t" + : : "r"(vtcm.vscatter32), "r"(region_len), + "r"(offsetshi), "r"(vgatherhi) + : "m0", "v0", "memory"); + sync_gather(vgatherlo); sync_gather(vgatherhi); } @@ -461,79 +477,88 @@ static unsigned int gather_32_masked_init(void) return letter | (letter << 8) | (letter << 16) | (letter << 24); } +/* masked gather the elements from the scatter32 buffer using HVX */ void vector_gather_32_masked(void) { - HVX_Vector *vgatherlo = (HVX_Vector *)&vtcm.vgather32; - HVX_Vector *vgatherhi = - (HVX_Vector *)((int)&vtcm.vgather32 + (MATRIX_SIZE * 2)); - HVX_Vector offsetslo = *(HVX_Vector *)word_offsets; - HVX_Vector offsetshi = *(HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; - HVX_Vector pred_reglo = *(HVX_Vector *)word_predicates; - HVX_VectorPred predslo = VAND_VAL(pred_reglo, ~0); - HVX_Vector pred_reghi = *(HVX_Vector *)&word_predicates[MATRIX_SIZE / 2]; - HVX_VectorPred predshi = VAND_VAL(pred_reghi, ~0); - - *vgatherlo = VSPLAT_H(gather_32_masked_init()); - *vgatherhi = VSPLAT_H(gather_32_masked_init()); - VGATHER_32_MASKED(vgatherlo, predslo, &vtcm.vscatter32, region_len, - offsetslo); - VGATHER_32_MASKED(vgatherhi, predshi, &vtcm.vscatter32, region_len, - offsetshi); + unsigned int init = gather_32_masked_init(); + HVX_Vector *vgatherlo = (HVX_Vector *)vtcm.vgather32; + HVX_Vector *vgatherhi = (HVX_Vector *)&vtcm.vgather32[MATRIX_SIZE / 2]; + HVX_Vector *offsetslo = (HVX_Vector *)word_offsets; + HVX_Vector *offsetshi = (HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; + HVX_Vector *predslo = (HVX_Vector *)word_predicates; + HVX_Vector *predshi = (HVX_Vector *)&word_predicates[MATRIX_SIZE / 2]; + + asm ("v0.h = vsplat(%5)\n\t" + "vmem(%4 + #0) = v0\n\t" /* initialize the write area */ + "r1 = #-1\n\t" + "v0 = vmem(%0 + #0)\n\t" + "q0 = vand(v0, r1)\n\t" + "m0 = %2\n\t" + "v0 = vmem(%3 + #0)\n\t" + "{ if (q0) vtmp.w = vgather(%1, m0, v0.w).w\n\t" + " vmem(%4 + #0) = vtmp.new }\n\t" + : : "r"(predslo), "r"(vtcm.vscatter32), "r"(region_len), + "r"(offsetslo), "r"(vgatherlo), "r"(init) + : "r1", "q0", "m0", "v0", "memory"); + asm ("v0.h = vsplat(%5)\n\t" + "vmem(%4 + #0) = v0\n\t" /* initialize the write area */ + "r1 = #-1\n\t" + "v0 = vmem(%0 + #0)\n\t" + "q0 = vand(v0, r1)\n\t" + "m0 = %2\n\t" + "v0 = vmem(%3 + #0)\n\t" + "{ if (q0) vtmp.w = vgather(%1, m0, v0.w).w\n\t" + " vmem(%4 + #0) = vtmp.new }\n\t" + : : "r"(predshi), "r"(vtcm.vscatter32), "r"(region_len), + "r"(offsetshi), "r"(vgatherhi), "r"(init) + : "r1", "q0", "m0", "v0", "memory"); sync_gather(vgatherlo); sync_gather(vgatherhi); } -/* gather the elements from the scatter16_32 buffer */ +/* gather the elements from the scatter16_32 buffer using HVX */ void vector_gather_16_32(void) { - HVX_Vector *vgather; - HVX_VectorPair offsets; - HVX_Vector values; - - /* get the vtcm address to gather from */ - vgather = (HVX_Vector *)&vtcm.vgather16_32; - - /* get the word offsets in a vector pair */ - offsets = *(HVX_VectorPair *)word_offsets; - - VGATHER_16_32(vgather, &vtcm.vscatter16_32, region_len, offsets); - - /* deal the elements to get the order back */ - values = *(HVX_Vector *)vgather; - values = VDEAL_H(values); - - /* write it back to vtcm address */ - *(HVX_Vector *)vgather = values; + asm ("m0 = %1\n\t" + "v0 = vmem(%2 + #0)\n\t" + "v1 = vmem(%2 + #1)\n\t" + "{ vtmp.h = vgather(%0, m0, v1:0.w).h\n\t" + " vmem(%3 + #0) = vtmp.new }\n\t" + "v0 = vmem(%3 + #0)\n\t" + "v0.h = vdeal(v0.h)\n\t" /* deal the elements to get the order back */ + "vmem(%3 + #0) = v0\n\t" + : : "r"(vtcm.vscatter16_32), "r"(region_len), + "r"(word_offsets), "r"(vtcm.vgather16_32) + : "m0", "v0", "v1", "memory"); + + sync_gather(vtcm.vgather16_32); } +/* masked gather the elements from the scatter16_32 buffer using HVX */ void vector_gather_16_32_masked(void) { - HVX_Vector *vgather; - HVX_VectorPair offsets; - HVX_Vector pred_reg; - HVX_VectorPred preds; - HVX_Vector values; - - /* get the vtcm address to gather from */ - vgather = (HVX_Vector *)&vtcm.vgather16_32; - - /* get the word offsets in a vector pair */ - offsets = *(HVX_VectorPair *)word_offsets; - pred_reg = *(HVX_Vector *)half_predicates; - pred_reg = VSHUFF_H(pred_reg); - preds = VAND_VAL(pred_reg, ~0); - - *vgather = VSPLAT_H(gather_16_masked_init()); - VGATHER_16_32_MASKED(vgather, preds, &vtcm.vscatter16_32, region_len, - offsets); - - /* deal the elements to get the order back */ - values = *(HVX_Vector *)vgather; - values = VDEAL_H(values); - - /* write it back to vtcm address */ - *(HVX_Vector *)vgather = values; + unsigned short init = gather_16_masked_init(); + + asm ("v0.h = vsplat(%5)\n\t" + "vmem(%4 + #0) = v0\n\t" /* initialize the write area */ + "r1 = #-1\n\t" + "v0 = vmem(%0 + #0)\n\t" + "v0.h = vshuff(v0.h)\n\t" /* shuffle the predicates */ + "q0 = vand(v0, r1)\n\t" + "m0 = %2\n\t" + "v0 = vmem(%3 + #0)\n\t" + "v1 = vmem(%3 + #1)\n\t" + "{ if (q0) vtmp.h = vgather(%1, m0, v1:0.w).h\n\t" + " vmem(%4 + #0) = vtmp.new }\n\t" + "v0 = vmem(%4 + #0)\n\t" + "v0.h = vdeal(v0.h)\n\t" /* deal the elements to get the order back */ + "vmem(%4 + #0) = v0\n\t" + : : "r"(half_predicates), "r"(vtcm.vscatter16_32), "r"(region_len), + "r"(word_offsets), "r"(vtcm.vgather16_32), "r"(init) + : "r1", "q0", "m0", "v0", "v1", "memory"); + + sync_gather(vtcm.vgather16_32); } static void check_buffer(const char *name, void *c, void *r, size_t size) @@ -579,6 +604,7 @@ void scalar_scatter_16_acc(unsigned short *vscatter16) } } +/* scatter-accumulate the 16 bit elements using C */ void check_scatter_16_acc() { memset(vscatter16_ref, FILL_CHAR, @@ -589,7 +615,7 @@ void check_scatter_16_acc() SCATTER_BUFFER_SIZE * sizeof(unsigned short)); } -/* scatter the 16 bit elements using C */ +/* masked scatter the 16 bit elements using C */ void scalar_scatter_16_masked(unsigned short *vscatter16) { for (int i = 0; i < MATRIX_SIZE; i++) { @@ -628,7 +654,7 @@ void check_scatter_32() SCATTER_BUFFER_SIZE * sizeof(unsigned int)); } -/* scatter the 32 bit elements using C */ +/* scatter-accumulate the 32 bit elements using C */ void scalar_scatter_32_acc(unsigned int *vscatter32) { for (int i = 0; i < MATRIX_SIZE; ++i) { @@ -646,7 +672,7 @@ void check_scatter_32_acc() SCATTER_BUFFER_SIZE * sizeof(unsigned int)); } -/* scatter the 32 bit elements using C */ +/* masked scatter the 32 bit elements using C */ void scalar_scatter_32_masked(unsigned int *vscatter32) { for (int i = 0; i < MATRIX_SIZE; i++) { @@ -667,7 +693,7 @@ void check_scatter_32_masked() SCATTER_BUFFER_SIZE * sizeof(unsigned int)); } -/* scatter the 32 bit elements using C */ +/* scatter the 16 bit elements with 32 bit offsets using C */ void scalar_scatter_16_32(unsigned short *vscatter16_32) { for (int i = 0; i < MATRIX_SIZE; ++i) { @@ -684,7 +710,7 @@ void check_scatter_16_32() SCATTER_BUFFER_SIZE * sizeof(unsigned short)); } -/* scatter the 32 bit elements using C */ +/* scatter-accumulate the 16 bit elements with 32 bit offsets using C */ void scalar_scatter_16_32_acc(unsigned short *vscatter16_32) { for (int i = 0; i < MATRIX_SIZE; ++i) { @@ -702,6 +728,7 @@ void check_scatter_16_32_acc() SCATTER_BUFFER_SIZE * sizeof(unsigned short)); } +/* masked scatter the 16 bit elements with 32 bit offsets using C */ void scalar_scatter_16_32_masked(unsigned short *vscatter16_32) { for (int i = 0; i < MATRIX_SIZE; i++) { @@ -738,6 +765,7 @@ void check_gather_16() MATRIX_SIZE * sizeof(unsigned short)); } +/* masked gather the elements from the scatter buffer using C */ void scalar_gather_16_masked(unsigned short *vgather16) { for (int i = 0; i < MATRIX_SIZE; ++i) { @@ -756,7 +784,7 @@ void check_gather_16_masked() MATRIX_SIZE * sizeof(unsigned short)); } -/* gather the elements from the scatter buffer using C */ +/* gather the elements from the scatter32 buffer using C */ void scalar_gather_32(unsigned int *vgather32) { for (int i = 0; i < MATRIX_SIZE; ++i) { @@ -772,6 +800,7 @@ void check_gather_32(void) MATRIX_SIZE * sizeof(unsigned int)); } +/* masked gather the elements from the scatter32 buffer using C */ void scalar_gather_32_masked(unsigned int *vgather32) { for (int i = 0; i < MATRIX_SIZE; ++i) { @@ -781,7 +810,6 @@ void scalar_gather_32_masked(unsigned int *vgather32) } } - void check_gather_32_masked(void) { memset(vgather32_ref, gather_32_masked_init(), @@ -791,7 +819,7 @@ void check_gather_32_masked(void) vgather32_ref, MATRIX_SIZE * sizeof(unsigned int)); } -/* gather the elements from the scatter buffer using C */ +/* gather the elements from the scatter16_32 buffer using C */ void scalar_gather_16_32(unsigned short *vgather16_32) { for (int i = 0; i < MATRIX_SIZE; ++i) { @@ -807,6 +835,7 @@ void check_gather_16_32(void) MATRIX_SIZE * sizeof(unsigned short)); } +/* masked gather the elements from the scatter16_32 buffer using C */ void scalar_gather_16_32_masked(unsigned short *vgather16_32) { for (int i = 0; i < MATRIX_SIZE; ++i) { From patchwork Thu Jan 5 22:13:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Simpson X-Patchwork-Id: 13090635 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 13E1FC4708E for ; Thu, 5 Jan 2023 22:40:11 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pDYUc-00071o-PB; Thu, 05 Jan 2023 17:13:46 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUY-0006xm-Rv for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:42 -0500 Received: from mx0a-0031df01.pphosted.com ([205.220.168.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pDYUU-00028W-UB for qemu-devel@nongnu.org; Thu, 05 Jan 2023 17:13:42 -0500 Received: from pps.filterd (m0279864.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 305KOckk021321; Thu, 5 Jan 2023 22:13:36 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=qcppdkim1; bh=O4IBHqwhtRzPQ8aVZT1s1W1wRgUnZr/G86NGze6Ead0=; b=SRitKlaVZaNrqDNnv5Yb8EjE/Jf44TOaWZT942/GkJs/al0YCvOOcy1C3Wo/dMi9rl1T sK6i3XjmwzDYaiqdCSKAIj7MHK7NV9K+L8FXgu7d9bdfO475SvNjAOzqoGLjYL0Ci0UQ GFjZFqWOw5dw/zDn8uWFqgB8hFobF/YoOlXBxqQ7hWqJ7awVaGFXnCCz+fXIjj6LslKQ 1H2FoLibu5NBpMcYctHkGRVVpl9PQyyBBuglCDCDKkFli6lmP5pTAMboSh5+sZIrXZek /fJ7LZBKVMzqnAtohSoIeA7tqfT5YmOhtxwTbDgsgKAtvi1KA3ArS4ZiOEXWf4G8VXmI 4A== Received: from nalasppmta02.qualcomm.com (Global_NAT1.qualcomm.com [129.46.96.20]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3mx57e07aw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 05 Jan 2023 22:13:36 +0000 Received: from pps.filterd (NALASPPMTA02.qualcomm.com [127.0.0.1]) by NALASPPMTA02.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTP id 305MDZaY002030; Thu, 5 Jan 2023 22:13:35 GMT Received: from pps.reinject (localhost [127.0.0.1]) by NALASPPMTA02.qualcomm.com (PPS) with ESMTP id 3mte5kxjve-1; Thu, 05 Jan 2023 22:13:35 +0000 Received: from NALASPPMTA02.qualcomm.com (NALASPPMTA02.qualcomm.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 305MDZlJ002024; Thu, 5 Jan 2023 22:13:35 GMT Received: from hu-devc-lv-u18-c.qualcomm.com (hu-tsimpson-lv.qualcomm.com [10.47.235.220]) by NALASPPMTA02.qualcomm.com (PPS) with ESMTP id 305MDYIG002019; Thu, 05 Jan 2023 22:13:35 +0000 Received: by hu-devc-lv-u18-c.qualcomm.com (Postfix, from userid 47164) id 4E43A500116; Thu, 5 Jan 2023 14:13:34 -0800 (PST) From: Taylor Simpson To: qemu-devel@nongnu.org Cc: tsimpson@quicinc.com, richard.henderson@linaro.org, philmd@linaro.org, ale@rev.ng, anjo@rev.ng, bcain@quicinc.com, quic_mathbern@quicinc.com Subject: [PATCH v3 9/9] Hexagon (tests/tcg/hexagon) Enable HVX tests Date: Thu, 5 Jan 2023 14:13:31 -0800 Message-Id: <20230105221331.12069-10-tsimpson@quicinc.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230105221331.12069-1-tsimpson@quicinc.com> References: <20230105221331.12069-1-tsimpson@quicinc.com> MIME-Version: 1.0 X-QCInternal: smtphost X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-GUID: w118u5vPOoPqAZfIMZYVJKvsldoK-R4I X-Proofpoint-ORIG-GUID: w118u5vPOoPqAZfIMZYVJKvsldoK-R4I X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2023-01-05_12,2023-01-05_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 adultscore=0 mlxlogscore=838 phishscore=0 lowpriorityscore=0 priorityscore=1501 clxscore=1015 bulkscore=0 suspectscore=0 spamscore=0 mlxscore=0 malwarescore=0 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2212070000 definitions=main-2301050174 Received-SPF: pass client-ip=205.220.168.131; envelope-from=tsimpson@qualcomm.com; helo=mx0a-0031df01.pphosted.com X-Spam_score_int: -24 X-Spam_score: -2.5 X-Spam_bar: -- X-Spam_report: (-2.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.25, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Made possible by new toolchain container Signed-off-by: Taylor Simpson --- tests/tcg/hexagon/Makefile.target | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/tests/tcg/hexagon/Makefile.target b/tests/tcg/hexagon/Makefile.target index 9ee1faa1e1..adca8326bf 100644 --- a/tests/tcg/hexagon/Makefile.target +++ b/tests/tcg/hexagon/Makefile.target @@ -1,5 +1,5 @@ ## -## Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved. +## Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved. ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by @@ -43,6 +43,10 @@ HEX_TESTS += load_align HEX_TESTS += atomics HEX_TESTS += fpstuff HEX_TESTS += overflow +HEX_TESTS += vector_add_int +HEX_TESTS += scatter_gather +HEX_TESTS += hvx_misc +HEX_TESTS += hvx_histogram HEX_TESTS += test_abs HEX_TESTS += test_bitcnt @@ -76,3 +80,10 @@ TESTS += $(HEX_TESTS) usr: usr.c $(CC) $(CFLAGS) -mv67t -O2 -Wno-inline-asm -Wno-expansion-to-defined $< -o $@ $(LDFLAGS) +scatter_gather: CFLAGS += -mhvx +vector_add_int: CFLAGS += -mhvx -fvectorize +hvx_misc: CFLAGS += -mhvx +hvx_histogram: CFLAGS += -mhvx -Wno-gnu-folding-constant + +hvx_histogram: hvx_histogram.c hvx_histogram_row.S + $(CC) $(CFLAGS) $(CROSS_CC_GUEST_CFLAGS) $^ -o $@