From patchwork Thu Apr 27 22:40:55 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Simpson
X-Patchwork-Id: 13225782
From: Taylor Simpson
To: qemu-devel@nongnu.org
Cc: tsimpson@quicinc.com, richard.henderson@linaro.org, philmd@linaro.org, ale@rev.ng, anjo@rev.ng, bcain@quicinc.com, quic_mathbern@quicinc.com
Subject: [PATCH v2 7/9] Hexagon (tests/tcg/hexagon) Add v69 HVX tests
Date: Thu, 27 Apr 2023 15:40:55 -0700
Message-Id: <20230427224057.3766963-8-tsimpson@quicinc.com>
In-Reply-To: <20230427224057.3766963-1-tsimpson@quicinc.com>
References: <20230427224057.3766963-1-tsimpson@quicinc.com>

The following instructions are tested
    V6_vasrvuhubrndsat
    V6_vasrvuhubsat
    V6_vasrvwuhrndsat
    V6_vasrvwuhsat
    V6_vassign_tmp
    V6_vcombine_tmp
    V6_vmpyuhvs

Signed-off-by: Taylor Simpson
Reviewed-by: Anton Johansson
---
 tests/tcg/hexagon/v69_hvx.c       | 318 ++++++++++++++++++++++++++++++
 tests/tcg/hexagon/Makefile.target |   3 +
 2 files changed, 321 insertions(+)
 create mode 100644 tests/tcg/hexagon/v69_hvx.c

diff --git a/tests/tcg/hexagon/v69_hvx.c b/tests/tcg/hexagon/v69_hvx.c
new file mode 100644
index 0000000000..a0d567d142
--- /dev/null
+++ b/tests/tcg/hexagon/v69_hvx.c
@@ -0,0 +1,318 @@
+/*
+ * Copyright(c) 2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see .
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+int err;
+
+#include "hvx_misc.h"
+
+#define fVROUND(VAL, SHAMT) \
+    ((VAL) + (((SHAMT) > 0) ? (1LL << ((SHAMT) - 1)) : 0))
+
+#define fVSATUB(VAL) \
+    ((((VAL) & 0xffLL) == (VAL)) ? \
+        (VAL) : \
+        ((((int32_t)(VAL)) < 0) ? 0 : 0xff))
+
+#define fVSATUH(VAL) \
+    ((((VAL) & 0xffffLL) == (VAL)) ? \
+        (VAL) : \
+        ((((int32_t)(VAL)) < 0) ? 0 : 0xffff))
+
+static void test_vasrvuhubrndsat(void)
+{
+    void *p0 = buffer0;
+    void *p1 = buffer1;
+    void *pout = output;
+
+    memset(expect, 0xaa, sizeof(expect));
+    memset(output, 0xbb, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE / 2; i++) {
+        asm("v4 = vmem(%0 + #0)\n\t"
+            "v5 = vmem(%0 + #1)\n\t"
+            "v6 = vmem(%1 + #0)\n\t"
+            "v5.ub = vasr(v5:4.uh, v6.ub):rnd:sat\n\t"
+            "vmem(%2) = v5\n\t"
+            : : "r"(p0), "r"(p1), "r"(pout)
+            : "v4", "v5", "v6", "memory");
+        p0 += sizeof(MMVector) * 2;
+        p1 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 2; j++) {
+            int shamt;
+            uint8_t byte0;
+            uint8_t byte1;
+
+            shamt = buffer1[i].ub[2 * j + 0] & 0x7;
+            byte0 = fVSATUB(fVROUND(buffer0[2 * i + 0].uh[j], shamt) >> shamt);
+            shamt = buffer1[i].ub[2 * j + 1] & 0x7;
+            byte1 = fVSATUB(fVROUND(buffer0[2 * i + 1].uh[j], shamt) >> shamt);
+            expect[i].uh[j] = (byte1 << 8) | (byte0 & 0xff);
+        }
+    }
+
+    check_output_h(__LINE__, BUFSIZE / 2);
+}
+
+static void test_vasrvuhubsat(void)
+{
+    void *p0 = buffer0;
+    void *p1 = buffer1;
+    void *pout = output;
+
+    memset(expect, 0xaa, sizeof(expect));
+    memset(output, 0xbb, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE / 2; i++) {
+        asm("v4 = vmem(%0 + #0)\n\t"
+            "v5 = vmem(%0 + #1)\n\t"
+            "v6 = vmem(%1 + #0)\n\t"
+            "v5.ub = vasr(v5:4.uh, v6.ub):sat\n\t"
+            "vmem(%2) = v5\n\t"
+            : : "r"(p0), "r"(p1), "r"(pout)
+            : "v4", "v5", "v6", "memory");
+        p0 += sizeof(MMVector) * 2;
+        p1 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 2; j++) {
+            int shamt;
+            uint8_t byte0;
+            uint8_t byte1;
+
+            shamt = buffer1[i].ub[2 * j + 0] & 0x7;
+            byte0 = fVSATUB(buffer0[2 * i + 0].uh[j] >> shamt);
+            shamt = buffer1[i].ub[2 * j + 1] & 0x7;
+            byte1 = fVSATUB(buffer0[2 * i + 1].uh[j] >> shamt);
+            expect[i].uh[j] = (byte1 << 8) | (byte0 & 0xff);
+        }
+    }
+
+    check_output_h(__LINE__, BUFSIZE / 2);
+}
+
+static void test_vasrvwuhrndsat(void)
+{
+    void *p0 = buffer0;
+    void *p1 = buffer1;
+    void *pout = output;
+
+    memset(expect, 0xaa, sizeof(expect));
+    memset(output, 0xbb, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE / 2; i++) {
+        asm("v4 = vmem(%0 + #0)\n\t"
+            "v5 = vmem(%0 + #1)\n\t"
+            "v6 = vmem(%1 + #0)\n\t"
+            "v5.uh = vasr(v5:4.w, v6.uh):rnd:sat\n\t"
+            "vmem(%2) = v5\n\t"
+            : : "r"(p0), "r"(p1), "r"(pout)
+            : "v4", "v5", "v6", "memory");
+        p0 += sizeof(MMVector) * 2;
+        p1 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 4; j++) {
+            int shamt;
+            uint16_t half0;
+            uint16_t half1;
+
+            shamt = buffer1[i].uh[2 * j + 0] & 0xf;
+            half0 = fVSATUH(fVROUND(buffer0[2 * i + 0].w[j], shamt) >> shamt);
+            shamt = buffer1[i].uh[2 * j + 1] & 0xf;
+            half1 = fVSATUH(fVROUND(buffer0[2 * i + 1].w[j], shamt) >> shamt);
+            expect[i].w[j] = (half1 << 16) | (half0 & 0xffff);
+        }
+    }
+
+    check_output_w(__LINE__, BUFSIZE / 2);
+}
+
+static void test_vasrvwuhsat(void)
+{
+    void *p0 = buffer0;
+    void *p1 = buffer1;
+    void *pout = output;
+
+    memset(expect, 0xaa, sizeof(expect));
+    memset(output, 0xbb, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE / 2; i++) {
+        asm("v4 = vmem(%0 + #0)\n\t"
+            "v5 = vmem(%0 + #1)\n\t"
+            "v6 = vmem(%1 + #0)\n\t"
+            "v5.uh = vasr(v5:4.w, v6.uh):sat\n\t"
+            "vmem(%2) = v5\n\t"
+            : : "r"(p0), "r"(p1), "r"(pout)
+            : "v4", "v5", "v6", "memory");
+        p0 += sizeof(MMVector) * 2;
+        p1 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 4; j++) {
+            int shamt;
+            uint16_t half0;
+            uint16_t half1;
+
+            shamt = buffer1[i].uh[2 * j + 0] & 0xf;
+            half0 = fVSATUH(buffer0[2 * i + 0].w[j] >> shamt);
+            shamt = buffer1[i].uh[2 * j + 1] & 0xf;
+            half1 = fVSATUH(buffer0[2 * i + 1].w[j] >> shamt);
+            expect[i].w[j] = (half1 << 16) | (half0 & 0xffff);
+        }
+    }
+
+    check_output_w(__LINE__, BUFSIZE / 2);
+}
+
+static void test_vassign_tmp(void)
+{
+    void *p0 = buffer0;
+    void *pout = output;
+
+    memset(expect, 0xaa, sizeof(expect));
+    memset(output, 0xbb, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE; i++) {
+        /*
+         * Assign into v12 as .tmp, then use it in the next packet
+         * Should get the new value within the same packet and
+         * the old value in the next packet
+         */
+        asm("v3 = vmem(%0 + #0)\n\t"
+            "r1 = #1\n\t"
+            "v12 = vsplat(r1)\n\t"
+            "r1 = #2\n\t"
+            "v13 = vsplat(r1)\n\t"
+            "{\n\t"
+            "    v12.tmp = v13\n\t"
+            "    v4.w = vadd(v12.w, v3.w)\n\t"
+            "}\n\t"
+            "v4.w = vadd(v4.w, v12.w)\n\t"
+            "vmem(%1 + #0) = v4\n\t"
+            : : "r"(p0), "r"(pout)
+            : "r1", "v3", "v4", "v12", "v13", "memory");
+        p0 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 4; j++) {
+            expect[i].w[j] = buffer0[i].w[j] + 3;
+        }
+    }
+
+    check_output_w(__LINE__, BUFSIZE);
+}
+
+static void test_vcombine_tmp(void)
+{
+    void *p0 = buffer0;
+    void *p1 = buffer1;
+    void *pout = output;
+
+    memset(expect, 0xaa, sizeof(expect));
+    memset(output, 0xbb, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE; i++) {
+        /*
+         * Combine into v13:12 as .tmp, then use it in the next packet
+         * Should get the new value within the same packet and
+         * the old value in the next packet
+         */
+        asm("v3 = vmem(%0 + #0)\n\t"
+            "r1 = #1\n\t"
+            "v12 = vsplat(r1)\n\t"
+            "r1 = #2\n\t"
+            "v13 = vsplat(r1)\n\t"
+            "r1 = #3\n\t"
+            "v14 = vsplat(r1)\n\t"
+            "r1 = #4\n\t"
+            "v15 = vsplat(r1)\n\t"
+            "{\n\t"
+            "    v13:12.tmp = vcombine(v15, v14)\n\t"
+            "    v4.w = vadd(v12.w, v3.w)\n\t"
+            "    v16 = v13\n\t"
+            "}\n\t"
+            "v4.w = vadd(v4.w, v12.w)\n\t"
+            "v4.w = vadd(v4.w, v13.w)\n\t"
+            "v4.w = vadd(v4.w, v16.w)\n\t"
+            "vmem(%2 + #0) = v4\n\t"
+            : : "r"(p0), "r"(p1), "r"(pout)
+            : "r1", "v3", "v4", "v12", "v13", "v14", "v15", "v16", "memory");
+        p0 += sizeof(MMVector);
+        p1 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 4; j++) {
+            expect[i].w[j] = buffer0[i].w[j] + 10;
+        }
+    }
+
+    check_output_w(__LINE__, BUFSIZE);
+}
+
+static void test_vmpyuhvs(void)
+{
+    void *p0 = buffer0;
+    void *p1 = buffer1;
+    void *pout = output;
+
+    memset(expect, 0xaa, sizeof(expect));
+    memset(output, 0xbb, sizeof(output));
+
+    for (int i = 0; i < BUFSIZE; i++) {
+        asm("v4 = vmem(%0 + #0)\n\t"
+            "v5 = vmem(%1 + #0)\n\t"
+            "v4.uh = vmpy(V4.uh, v5.uh):>>16\n\t"
+            "vmem(%2) = v4\n\t"
+            : : "r"(p0), "r"(p1), "r"(pout)
+            : "v4", "v5", "memory");
+        p0 += sizeof(MMVector);
+        p1 += sizeof(MMVector);
+        pout += sizeof(MMVector);
+
+        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 2; j++) {
+            expect[i].uh[j] = (buffer0[i].uh[j] * buffer1[i].uh[j]) >> 16;
+        }
+    }
+
+    check_output_h(__LINE__, BUFSIZE);
+}
+
+int main()
+{
+    init_buffers();
+
+    test_vasrvuhubrndsat();
+    test_vasrvuhubsat();
+    test_vasrvwuhrndsat();
+    test_vasrvwuhsat();
+
+    test_vassign_tmp();
+    test_vcombine_tmp();
+
+    test_vmpyuhvs();
+
+    puts(err ? "FAIL" : "PASS");
+    return err ? 1 : 0;
+}
diff --git a/tests/tcg/hexagon/Makefile.target b/tests/tcg/hexagon/Makefile.target
index 2ee930cf1f..558c056148 100644
--- a/tests/tcg/hexagon/Makefile.target
+++ b/tests/tcg/hexagon/Makefile.target
@@ -78,6 +78,7 @@ HEX_TESTS += test_vspliceb
 
 HEX_TESTS += v68_scalar
 HEX_TESTS += v68_hvx
+HEX_TESTS += v69_hvx
 
 TESTS += $(HEX_TESTS)
 
@@ -95,6 +96,8 @@ hvx_misc: CFLAGS += -mhvx
 hvx_histogram: CFLAGS += -mhvx -Wno-gnu-folding-constant
 v68_hvx: v68_hvx.c hvx_misc.h v6mpy_ref.c.inc
 v68_hvx: CFLAGS += -mhvx -Wno-unused-function
+v69_hvx: v69_hvx.c hvx_misc.h
+v69_hvx: CFLAGS += -mhvx -Wno-unused-function
 
 hvx_histogram: hvx_histogram.c hvx_histogram_row.S
 	$(CC) $(CFLAGS) $(CROSS_CC_GUEST_CFLAGS) $^ -o $@ $(LDFLAGS)
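
Note on the expected values: the per-lane reference computations that the tests in v69_hvx.c perform on the host can be tried without a Hexagon toolchain. The following is a minimal sketch in portable C, not code from the patch. All helper names (vround, vsat_ub, ref_vasr_uh_ub_rnd_sat, ref_vasr_uh_ub_sat, ref_vmpy_uh_shr16, ref_self_check) are hypothetical. The saturation is written with explicit comparisons rather than the mask-and-cast form of fVSATUB, and the multiply widens to uint32_t before multiplying, which is an assumption about the intended unsigned semantics rather than what the patch writes.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding bias for a right shift by shamt (same idea as fVROUND). */
static int64_t vround(int64_t val, int shamt)
{
    return val + (shamt > 0 ? (1LL << (shamt - 1)) : 0);
}

/* Saturate to the unsigned 8-bit range (hypothetical stand-in for fVSATUB). */
static uint8_t vsat_ub(int64_t val)
{
    if (val < 0) {
        return 0;
    }
    if (val > 0xff) {
        return 0xff;
    }
    return (uint8_t)val;
}

/* One output byte of vasr(uh, ub):rnd:sat; only the low 3 shift bits count. */
static uint8_t ref_vasr_uh_ub_rnd_sat(uint16_t val, uint8_t shift_byte)
{
    int shamt = shift_byte & 0x7;
    return vsat_ub(vround(val, shamt) >> shamt);
}

/* One output byte of vasr(uh, ub):sat, the same operation without rounding. */
static uint8_t ref_vasr_uh_ub_sat(uint16_t val, uint8_t shift_byte)
{
    int shamt = shift_byte & 0x7;
    return vsat_ub((int64_t)val >> shamt);
}

/* Upper half of an unsigned 16x16 multiply, as vmpy(uh, uh):>>16 is modeled. */
static uint16_t ref_vmpy_uh_shr16(uint16_t a, uint16_t b)
{
    /* Widen first so the product cannot overflow a signed int. */
    return (uint16_t)(((uint32_t)a * (uint32_t)b) >> 16);
}

/* Small self-check of the helpers above. */
void ref_self_check(void)
{
    assert(ref_vasr_uh_ub_rnd_sat(0x00ff, 1) == 0x80);
    assert(ref_vasr_uh_ub_sat(0x00ff, 1) == 0x7f);
    assert(ref_vmpy_uh_shr16(0xffff, 0xffff) == 0xfffe);
}
```

For example, shifting 0x00ff right by 1 gives 0x80 with rounding and 0x7f without, and any shifted value above 0xff saturates to 0xff.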