From patchwork Wed Sep 11 16:50:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800912 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 21792EE0206 for ; Wed, 11 Sep 2024 16:52:31 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYQ-0001rm-DF; Wed, 11 Sep 2024 12:50:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYO-0001ql-VA for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:52 -0400 Received: from mail-pl1-x62b.google.com ([2607:f8b0:4864:20::62b]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYN-0003e3-Ed for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:52 -0400 Received: by mail-pl1-x62b.google.com with SMTP id d9443c01a7336-2068acc8b98so689655ad.3 for ; Wed, 11 Sep 2024 09:50:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073450; x=1726678250; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=tN0OHUrqQbUaQCGulY54+8M9t2g2sAjn/2PDGh46k0c=; b=BjXJFFVXtObTeoaICNCAfGZRbuej3dwe4qRcQlq8yXhxzdovKqYpIOF9cvOz4qb1bG l3eW3zaePxM/UWFykAmTlUvosw8xC91OUQbqRcDT6MamrVvpbUYxubqiUgLzcVx8Fp/b 7ne+5Jv+ewTCoCpFrnpmB2OWBYHeP0RJcr/nuHbOhWLLImLSkJudKHYua+KdJXtPBTT8 vPDYArtG/nRXdYCCmnCRIwQa4YcW7el/qUGwf7H6xAqeEvHE/iXyCYwam7zC4gmU/6Aa jJULOM7K3E+AAgAYck3WvTQozeQG2llDNT0Kze7sYRpzBOXPKUWzZsZDd/z5OATFomIG 4iAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073450; x=1726678250; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tN0OHUrqQbUaQCGulY54+8M9t2g2sAjn/2PDGh46k0c=; b=QlwbvgBEdb5u7y7tksRFRMH9Ww2ZD/yHAdK1dqfHrircaGAWeEiB/TB7SdXzFxxya5 VDwdF6BN4bEtR8oDNP8WMnuLavQhD+fy0ZUfn8Vz/HgsWd8fzE8ylrSAWmhORHiEjqQQ BKbVUuPICh6S280/5wQHIIvJbt2UOBJ3JReZLOAweZuGGbbrRHW5heXpTk77LU5L/5Er pTYPKaPElrIVq25WUFhnTTo0dIezXvVFLjHGPqGmvYu2+szDXcQIVh36m5YLBuKkqEdJ LN85Sq+GIDqmviJsqQ8eYTxMGbuDO0QHTtLA4Or/6d+7g90x6G9G+CRw2yoKKQ7XKNfw NiDw== X-Gm-Message-State: AOJu0Yz2GsepcyHpdmHxzzCC2XxjKVS8pUfCBPjdGmXFdf7pINypxBmd n24CJ5bn7z4CLaHAulVj+0RKCVOrmeOV1Hgfw/opEsfYaB1GKQKk6D0GM9WWC6LoC+n9mDcoFCx U X-Google-Smtp-Source: AGHT+IEsbGzzYMjBi2hO0WXV5b3+LlqaGEyckmk55oYnNNiom9m72bSztJceJJDMpI86ksyi4m7TPg== X-Received: by 2002:a05:6a21:6f83:b0:1cf:47fd:50ba with SMTP id adf61e73a8af0-1cf5e17b1f8mr6059579637.37.1726073449911; Wed, 11 Sep 2024 09:50:49 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.50.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:50:49 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 01/18] tcg: Export vec_gen_6 Date: Wed, 11 Sep 2024 09:50:30 -0700 Message-ID: <20240911165047.1035764-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62b; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Add declaration to tcg-internal.h, making it available for use from tcg backend vector expanders. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/tcg-internal.h | 2 ++ tcg/tcg-op-vec.c | 4 ++-- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/tcg/tcg-internal.h b/tcg/tcg-internal.h index d18f49f5d3..8099248076 100644 --- a/tcg/tcg-internal.h +++ b/tcg/tcg-internal.h @@ -102,5 +102,7 @@ TCGOp *tcg_gen_op6(TCGOpcode, TCGArg, TCGArg, TCGArg, TCGArg, TCGArg, TCGArg); void vec_gen_2(TCGOpcode, TCGType, unsigned, TCGArg, TCGArg); void vec_gen_3(TCGOpcode, TCGType, unsigned, TCGArg, TCGArg, TCGArg); void vec_gen_4(TCGOpcode, TCGType, unsigned, TCGArg, TCGArg, TCGArg, TCGArg); +void vec_gen_6(TCGOpcode opc, TCGType type, unsigned vece, TCGArg r, + TCGArg a, TCGArg b, TCGArg c, TCGArg d, TCGArg e); #endif /* TCG_INTERNAL_H */ diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c index 84af210bc0..d4bb4aee74 100644 --- a/tcg/tcg-op-vec.c +++ b/tcg/tcg-op-vec.c @@ -172,8 +172,8 @@ void vec_gen_4(TCGOpcode opc, TCGType type, unsigned vece, op->args[3] = c; } -static void vec_gen_6(TCGOpcode opc, TCGType type, unsigned vece, TCGArg r, - TCGArg a, TCGArg b, TCGArg c, TCGArg d, TCGArg e) +void vec_gen_6(TCGOpcode opc, TCGType type, unsigned vece, TCGArg r, + TCGArg a, TCGArg b, TCGArg c, TCGArg d, TCGArg e) { TCGOp *op = tcg_emit_op(opc, 6); TCGOP_VECL(op) = type - TCG_TYPE_V64; From patchwork Wed Sep 11 16:50:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800908 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A64E7EE57C8 for ; Wed, 11 Sep 2024 16:51:49 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYR-0001xC-TX; Wed, 11 Sep 2024 12:50:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYQ-0001rn-Al for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:54 -0400 Received: from mail-pg1-x534.google.com ([2607:f8b0:4864:20::534]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYO-0003eC-8M for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:54 -0400 Received: by mail-pg1-x534.google.com with SMTP id 41be03b00d2f7-7db174f9050so30642a12.3 for ; Wed, 11 Sep 2024 09:50:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073451; x=1726678251; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=E4LfLCCfd77AE3gIUDkv8zMuzeAX8OHemsoCsp4r7PY=; b=Yn+TF8iFS0O+5Vw/6QXmmUH33fyrxlY5MKmAAd0rXtonfk0vnzDGqyIkltkk5QouN5 boYtZmp1DvIgu5mAI9MAomgpIUVOlPhDTib9dU1KJg0w7WPr5o30H//e0eK5exx18u3r 0+QDqyKv+ujBeY0QSyECAZxDKNggetU+3OWEwu0gRjsVhmeaLYYW17zc8Gjm5auzVgF2 1g0qH19rmPN9G5Gy6+Gw7n0LlAKtybIdyQ+gcpRGVkEg1B+3lHjY1U3NUzg6vuYMKLvO u5z4d8a69YaNxz9W5e0+rbOLqekxc0OZPiWhnPPyl7GCwxPtRpsuPDybomRq7w6hEOxj ojig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073451; x=1726678251; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=E4LfLCCfd77AE3gIUDkv8zMuzeAX8OHemsoCsp4r7PY=; b=d2QYW8iM4zZ+wPsxa64iApYMb1vl8mBjShbyZx6xLTve8jWOK7Zkn9+DBUoFZbDwU0 wxY6kvBz4DhGEbUJ5bi8Joq8kIpeT9GaWHiDM0ZF48NQEzh7AhaugKji3W/ZRNbESsaG 8/kZq2yPcSAPYaH2y7X0+WsJZkbW8xrbDfme14vvShhDLHsSUmh2ZWhMWCgtwA8E+RCb /qXCQxw/fRXUhlnBxiJ+ylTjn7ILCIXK0kekF9tf86nIL/qS+MSzTifVzCzJDAE5B0Yh 8zrGdOc7wdlTvxAykD6qXZ9Moy1bBQyqU0JAMbTwPPTdjch1qfyaLqMwC+jNABUx2TrF Z88A== X-Gm-Message-State: AOJu0Yw7BShFf3FJbUx7ZfJ3YEZFDEn0n3H8ZKFkDFpgq8vaAtMI7hAU 3GFeBIO/ukNkG14PWUoIGQ8lZF6ulv9TXS1x4TDo7SGolK8Keec7Y9LiEhdn92vM2tQbtx3otLk N X-Google-Smtp-Source: AGHT+IFn8LWW0mQbbQ3lncg5aU5yuUvEE2h8vNuPGqorWDXr2WYRJj8jlHkap/m4oOOiC4eA8jjAfQ== X-Received: by 2002:a05:6a20:c70b:b0:1cf:5d45:3b29 with SMTP id adf61e73a8af0-1cf5e1577cemr7790850637.35.1726073450774; Wed, 11 Sep 2024 09:50:50 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.50.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:50:50 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 02/18] tcg/i386: Split out tcg_out_vex_modrm_type Date: Wed, 11 Sep 2024 09:50:31 -0700 Message-ID: <20240911165047.1035764-3-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::534; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x534.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Helper function to handle setting of VEXL based on the type of the operation. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 38 +++++++++++++++----------------------- 1 file changed, 15 insertions(+), 23 deletions(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 9a54ef7f8d..af71a397b1 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -711,6 +711,15 @@ static void tcg_out_vex_modrm(TCGContext *s, int opc, int r, int v, int rm) tcg_out8(s, 0xc0 | (LOWREGMASK(r) << 3) | LOWREGMASK(rm)); } +static void tcg_out_vex_modrm_type(TCGContext *s, int opc, + int r, int v, int rm, TCGType type) +{ + if (type == TCG_TYPE_V256) { + opc |= P_VEXL; + } + tcg_out_vex_modrm(s, opc, r, v, rm); +} + /* Output an opcode with a full "rm + (index< X-Patchwork-Id: 13800916 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BF40BEE57C6 for ; Wed, 11 Sep 2024 16:52:58 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYT-00021I-6d; Wed, 11 Sep 2024 12:50:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYR-0001wJ-K7 for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:55 -0400 Received: from mail-pf1-x42c.google.com ([2607:f8b0:4864:20::42c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYP-0003eL-Ff for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:55 -0400 Received: by mail-pf1-x42c.google.com with SMTP id d2e1a72fcca58-71911585ac4so1699794b3a.1 for ; Wed, 11 Sep 2024 09:50:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073452; x=1726678252; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=tnL0glGXZj+IRh5HnNxFKu2Xl8Anqsqr5rwYjln/1tc=; b=TChzLhdXqaZINMF4UoycKvqMxVIBIZMqRlpFnRkwO4JisUzLdXlqiRSY0JSLWYEot0 gL6iKMO3t8/bnakxJ4kwzIxTAvlwWiSqyC+l6w4bXQo42G8P2U1CjgfAodUvNgxCoeF0 vTuroKrkJ3V1KivbM9vMTA3sMSxhhAvn1Z1nWLJOnWDX7NtPYkUz5f4hYxlA1YP0elB1 YnIEVzJCXzoDzMHB+KB+hPSBub4gBrPnYrzSu7wrMpiHIMAmzGjwAu7UyeoYnOZRZJx7 00N4Zn/BUzSYNapZbrBtjcI1tvdLrF6vtZa/oD9dNTpJgQW8fb1+DIyWNugO48+5+TtG q1mA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073452; x=1726678252; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tnL0glGXZj+IRh5HnNxFKu2Xl8Anqsqr5rwYjln/1tc=; b=pGxIYzQ5M61JiTT7T/IaWl4f+3LT++zUYz+8+E7jkdCufr/WL9q8ircoQ94999PsJ6 9EVb5vmub4W/mfzq32w12441sd78jDIph+FgVZOdfT8Ioygidx1hE2yqmHTTABoqowKT ds9zRjqg8qSp9mRn+yJkmOt7ZRx2GMP1s8BV7ZK3bM2OsSpDgkjbkZb4X6EZHjcAugvu 7ZdE+zz8aES6H0RVUk3cvQresDJOZUIEAPxyGOlwE/fRNK6iLJfScYI6XS1JBSGrCsrr 8WuOx2XhwCK369EQ+3cPcdaikIlWmmojcoIys8lRaFrM00XrVqrHFqcIvu7rTRf97zPs Lhew== X-Gm-Message-State: AOJu0Yz7tckxQTaq5fkOSGFMgDGuPJBnorEGpp6NzrpKmom2F5lRJYPO jAiUSdow77ivbLnRcA0qc5Ol7m/fzHOK84QwZb2pdFQ8g/ffkoH2bypMzVpLat18OMVO/q5exIQ 3 X-Google-Smtp-Source: AGHT+IENDTzTQo5JQkpteKRYdKoLdI6zdDF46E3WnrjkI34guG67n1zUR+hbaUidD9WncAFS9oFjBg== X-Received: by 2002:a05:6a00:2381:b0:70d:21d9:e2ae with SMTP id d2e1a72fcca58-71926063653mr26411b3a.6.1726073451767; Wed, 11 Sep 2024 09:50:51 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.50.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:50:51 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 03/18] tcg/i386: Do not expand cmp_vec early Date: Wed, 11 Sep 2024 09:50:32 -0700 Message-ID: <20240911165047.1035764-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42c; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Move most of expansion to opcode generation, leaving the conversion of unsigned to signed to be done in the early phase. Small inefficiencies, but not incorrect results, are introduced until cmpsel_vec is converted in the next patch. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 223 +++++++++++++++++--------------------- 1 file changed, 100 insertions(+), 123 deletions(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index af71a397b1..278e567b56 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -3029,6 +3029,92 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, #undef OP_32_64 } +static int const umin_insn[4] = { + OPC_PMINUB, OPC_PMINUW, OPC_PMINUD, OPC_VPMINUQ +}; + +static int const umax_insn[4] = { + OPC_PMAXUB, OPC_PMAXUW, OPC_PMAXUD, OPC_VPMAXUQ +}; + +static bool tcg_out_cmp_vec_noinv(TCGContext *s, TCGType type, unsigned vece, + TCGReg v0, TCGReg v1, TCGReg v2, TCGCond cond) +{ + static int const cmpeq_insn[4] = { + OPC_PCMPEQB, OPC_PCMPEQW, OPC_PCMPEQD, OPC_PCMPEQQ + }; + static int const cmpgt_insn[4] = { + OPC_PCMPGTB, OPC_PCMPGTW, OPC_PCMPGTD, OPC_PCMPGTQ + }; + + enum { + NEED_INV = 1, + NEED_SWAP = 2, + NEED_UMIN = 4, + NEED_UMAX = 8, + INVALID = 16, + }; + static const uint8_t cond_fixup[16] = { + [0 ... 15] = INVALID, + [TCG_COND_EQ] = 0, + [TCG_COND_GT] = 0, + [TCG_COND_NE] = NEED_INV, + [TCG_COND_LE] = NEED_INV, + [TCG_COND_LT] = NEED_SWAP, + [TCG_COND_GE] = NEED_SWAP | NEED_INV, + [TCG_COND_LEU] = NEED_UMIN, + [TCG_COND_GTU] = NEED_UMIN | NEED_INV, + [TCG_COND_GEU] = NEED_UMAX, + [TCG_COND_LTU] = NEED_UMAX | NEED_INV, + }; + int fixup = cond_fixup[cond]; + + assert(!(fixup & INVALID)); + + if (fixup & NEED_INV) { + cond = tcg_invert_cond(cond); + } + + if (fixup & NEED_SWAP) { + TCGReg swap = v1; + v1 = v2; + v2 = swap; + cond = tcg_swap_cond(cond); + } + + if (fixup & (NEED_UMIN | NEED_UMAX)) { + int op = (fixup & NEED_UMIN ? umin_insn[vece] : umax_insn[vece]); + + /* avx2 does not have 64-bit min/max; adjusted during expand. */ + assert(vece <= MO_32); + + tcg_out_vex_modrm_type(s, op, TCG_TMP_VEC, v1, v2, type); + v2 = TCG_TMP_VEC; + cond = TCG_COND_EQ; + } + + switch (cond) { + case TCG_COND_EQ: + tcg_out_vex_modrm_type(s, cmpeq_insn[vece], v0, v1, v2, type); + break; + case TCG_COND_GT: + tcg_out_vex_modrm_type(s, cmpgt_insn[vece], v0, v1, v2, type); + break; + default: + g_assert_not_reached(); + } + return fixup & NEED_INV; +} + +static void tcg_out_cmp_vec(TCGContext *s, TCGType type, unsigned vece, + TCGReg v0, TCGReg v1, TCGReg v2, TCGCond cond) +{ + if (tcg_out_cmp_vec_noinv(s, type, vece, v0, v1, v2, cond)) { + tcg_out_dupi_vec(s, type, vece, TCG_TMP_VEC, -1); + tcg_out_vex_modrm_type(s, OPC_PXOR, v0, v0, TCG_TMP_VEC, type); + } +} + static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned vecl, unsigned vece, const TCGArg args[TCG_MAX_OP_ARGS], @@ -3058,12 +3144,6 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, static int const shift_imm_insn[4] = { OPC_UD2, OPC_PSHIFTW_Ib, OPC_PSHIFTD_Ib, OPC_PSHIFTQ_Ib }; - static int const cmpeq_insn[4] = { - OPC_PCMPEQB, OPC_PCMPEQW, OPC_PCMPEQD, OPC_PCMPEQQ - }; - static int const cmpgt_insn[4] = { - OPC_PCMPGTB, OPC_PCMPGTW, OPC_PCMPGTD, OPC_PCMPGTQ - }; static int const punpckl_insn[4] = { OPC_PUNPCKLBW, OPC_PUNPCKLWD, OPC_PUNPCKLDQ, OPC_PUNPCKLQDQ }; @@ -3082,12 +3162,6 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, static int const smax_insn[4] = { OPC_PMAXSB, OPC_PMAXSW, OPC_PMAXSD, OPC_VPMAXSQ }; - static int const umin_insn[4] = { - OPC_PMINUB, OPC_PMINUW, OPC_PMINUD, OPC_VPMINUQ - }; - static int const umax_insn[4] = { - OPC_PMAXUB, OPC_PMAXUW, OPC_PMAXUD, OPC_VPMAXUQ - }; static int const rotlv_insn[4] = { OPC_UD2, OPC_UD2, OPC_VPROLVD, OPC_VPROLVQ }; @@ -3243,15 +3317,8 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, break; case INDEX_op_cmp_vec: - sub = args[3]; - if (sub == TCG_COND_EQ) { - insn = cmpeq_insn[vece]; - } else if (sub == TCG_COND_GT) { - insn = cmpgt_insn[vece]; - } else { - g_assert_not_reached(); - } - goto gen_simd; + tcg_out_cmp_vec(s, type, vece, a0, a1, a2, args[3]); + break; case INDEX_op_andc_vec: insn = OPC_PANDN; @@ -3971,88 +4038,19 @@ static void expand_vec_mul(TCGType type, unsigned vece, } } -static bool expand_vec_cmp_noinv(TCGType type, unsigned vece, TCGv_vec v0, - TCGv_vec v1, TCGv_vec v2, TCGCond cond) +static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, + TCGv_vec v1, TCGv_vec v2, TCGCond cond) { - enum { - NEED_INV = 1, - NEED_SWAP = 2, - NEED_BIAS = 4, - NEED_UMIN = 8, - NEED_UMAX = 16, - }; - TCGv_vec t1, t2, t3; - uint8_t fixup; + /* + * Without AVX512, there are no 64-bit unsigned comparisons. + * We must bias the inputs so that they become signed. + * All other swapping and inversion are handled during code generation. + */ + if (vece == MO_64 && is_unsigned_cond(cond)) { + TCGv_vec t1 = tcg_temp_new_vec(type); + TCGv_vec t2 = tcg_temp_new_vec(type); + TCGv_vec t3 = tcg_constant_vec(type, vece, 1ull << ((8 << vece) - 1)); - switch (cond) { - case TCG_COND_EQ: - case TCG_COND_GT: - fixup = 0; - break; - case TCG_COND_NE: - case TCG_COND_LE: - fixup = NEED_INV; - break; - case TCG_COND_LT: - fixup = NEED_SWAP; - break; - case TCG_COND_GE: - fixup = NEED_SWAP | NEED_INV; - break; - case TCG_COND_LEU: - if (tcg_can_emit_vec_op(INDEX_op_umin_vec, type, vece)) { - fixup = NEED_UMIN; - } else { - fixup = NEED_BIAS | NEED_INV; - } - break; - case TCG_COND_GTU: - if (tcg_can_emit_vec_op(INDEX_op_umin_vec, type, vece)) { - fixup = NEED_UMIN | NEED_INV; - } else { - fixup = NEED_BIAS; - } - break; - case TCG_COND_GEU: - if (tcg_can_emit_vec_op(INDEX_op_umax_vec, type, vece)) { - fixup = NEED_UMAX; - } else { - fixup = NEED_BIAS | NEED_SWAP | NEED_INV; - } - break; - case TCG_COND_LTU: - if (tcg_can_emit_vec_op(INDEX_op_umax_vec, type, vece)) { - fixup = NEED_UMAX | NEED_INV; - } else { - fixup = NEED_BIAS | NEED_SWAP; - } - break; - default: - g_assert_not_reached(); - } - - if (fixup & NEED_INV) { - cond = tcg_invert_cond(cond); - } - if (fixup & NEED_SWAP) { - t1 = v1, v1 = v2, v2 = t1; - cond = tcg_swap_cond(cond); - } - - t1 = t2 = NULL; - if (fixup & (NEED_UMIN | NEED_UMAX)) { - t1 = tcg_temp_new_vec(type); - if (fixup & NEED_UMIN) { - tcg_gen_umin_vec(vece, t1, v1, v2); - } else { - tcg_gen_umax_vec(vece, t1, v1, v2); - } - v2 = t1; - cond = TCG_COND_EQ; - } else if (fixup & NEED_BIAS) { - t1 = tcg_temp_new_vec(type); - t2 = tcg_temp_new_vec(type); - t3 = tcg_constant_vec(type, vece, 1ull << ((8 << vece) - 1)); tcg_gen_sub_vec(vece, t1, v1, t3); tcg_gen_sub_vec(vece, t2, v2, t3); v1 = t1; @@ -4060,26 +4058,9 @@ static bool expand_vec_cmp_noinv(TCGType type, unsigned vece, TCGv_vec v0, cond = tcg_signed_cond(cond); } - tcg_debug_assert(cond == TCG_COND_EQ || cond == TCG_COND_GT); /* Expand directly; do not recurse. */ vec_gen_4(INDEX_op_cmp_vec, type, vece, tcgv_vec_arg(v0), tcgv_vec_arg(v1), tcgv_vec_arg(v2), cond); - - if (t1) { - tcg_temp_free_vec(t1); - if (t2) { - tcg_temp_free_vec(t2); - } - } - return fixup & NEED_INV; -} - -static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, - TCGv_vec v1, TCGv_vec v2, TCGCond cond) -{ - if (expand_vec_cmp_noinv(type, vece, v0, v1, v2, cond)) { - tcg_gen_not_vec(vece, v0, v0); - } } static void expand_vec_cmpsel(TCGType type, unsigned vece, TCGv_vec v0, @@ -4088,11 +4069,7 @@ static void expand_vec_cmpsel(TCGType type, unsigned vece, TCGv_vec v0, { TCGv_vec t = tcg_temp_new_vec(type); - if (expand_vec_cmp_noinv(type, vece, t, c1, c2, cond)) { - /* Invert the sense of the compare by swapping arguments. */ - TCGv_vec x; - x = v3, v3 = v4, v4 = x; - } + expand_vec_cmp(type, vece, t, c1, c2, cond); vec_gen_4(INDEX_op_x86_vpblendvb_vec, type, vece, tcgv_vec_arg(v0), tcgv_vec_arg(v4), tcgv_vec_arg(v3), tcgv_vec_arg(t)); From patchwork Wed Sep 11 16:50:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800919 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 22CFFEE0206 for ; Wed, 11 Sep 2024 16:53:11 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYT-00023z-Ul; Wed, 11 Sep 2024 12:50:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYS-0001ys-Dt for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:56 -0400 Received: from mail-pf1-x42a.google.com ([2607:f8b0:4864:20::42a]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYQ-0003ee-Fx for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:56 -0400 Received: by mail-pf1-x42a.google.com with SMTP id d2e1a72fcca58-7179802b8fcso5116391b3a.1 for ; Wed, 11 Sep 2024 09:50:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073453; x=1726678253; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=/08vmd3pJmcPxGI9Dvd5g1DO6IpTA5tvF/u1ewyB0/g=; b=WRCrZvKoSYjDJnX3rABfvN7EJZExFB2uFGCgDuPIK21utUCYXVo0LURGvbKoLdTfaL NBLKP5+IMJ8C8h2dPGf5t9sEl6tUu4dtOVoU93hMcN5ur1BRFk+6UOyR6Re87vDZP2Vd VjuSaxabRO2USN1ip6TfLU1YuscGhbms5Wykfed4e/PLq+WMv8s3Y+Gc9Svpy3e5MSeQ In94KYTEdnN2E5ByGG+N+8888vfJLPQiU66V9ASqkNNeCGm8TK5vfiGOXo4maKrZF4E3 dn/37xxuchnecQN881wlOhpA8NYWmqn+JnjytO4oHwycFx0LRL6BmKKV+8jcG7j6ZUOx k08g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073453; x=1726678253; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/08vmd3pJmcPxGI9Dvd5g1DO6IpTA5tvF/u1ewyB0/g=; b=pQcydhFcu5V0ofYoWny9/Y5QlZ4xCMekSTENTqJxKpyUQYOv3S6cIWN+dLdWk+jH/k kyBex1TYHzxKXXVE/io+67Jv6Bz5jr50MvYCmdV0efLP6mX6IoWvgowQsBJdyrGGuTNR qP02bbzcssLlC07QwH5Y4EJiq0vM3EiExNEI9XaPphBlj3VJjOUuqdQsP9mLO6itL9Hf nQ7oL+1iLkAwuRNH0wBH3tyCqIYKueToKMVrljzipd85IqvQEdKcP2CXM2cEANczkqdn WPxHQNLX2jsToPYEL5SE0oE7C3bUPlWMaC4PVoH/0NEtO3jReuPrFAc4YgRbKThquV/u kfZw== X-Gm-Message-State: AOJu0Yw4Z2B2vY3zaY2g0I9mU1fjvhoeaxnK8fCy4ysfFxrm9rCto+jN 8VHQQBTEq6i0iEnllubqLS9UdsHoyJHMAS9ZzxtUVLncp4DP4Gr4oIXCcV981Dz6DXBvE0w2wgm S X-Google-Smtp-Source: AGHT+IEq0YMhjDuQJVwM+fn0GQGP+zOOtYW9EVYQUYe7/8O7hSViZUsE3Ei43Tt7tPlQEhd0RrPA1Q== X-Received: by 2002:a05:6a20:cd0e:b0:1cf:3a2c:e3fe with SMTP id adf61e73a8af0-1cf5e199538mr7264012637.32.1726073452805; Wed, 11 Sep 2024 09:50:52 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.50.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:50:52 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 04/18] tcg/i386: Do not expand cmpsel_vec early Date: Wed, 11 Sep 2024 09:50:33 -0700 Message-ID: <20240911165047.1035764-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42a; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Expand during output instead of during opcode generation. Remove x86_vpblendvb_vec opcode, this this removes the only user. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target-con-set.h | 1 + tcg/i386/tcg-target.h | 2 +- tcg/i386/tcg-target.opc.h | 1 - tcg/i386/tcg-target.c.inc | 84 +++++++++++++++++++++-------------- 4 files changed, 53 insertions(+), 35 deletions(-) diff --git a/tcg/i386/tcg-target-con-set.h b/tcg/i386/tcg-target-con-set.h index e24241cfa2..da4411d96b 100644 --- a/tcg/i386/tcg-target-con-set.h +++ b/tcg/i386/tcg-target-con-set.h @@ -50,6 +50,7 @@ C_N1_I2(r, r, r) C_N1_I2(r, r, rW) C_O1_I3(x, 0, x, x) C_O1_I3(x, x, x, x) +C_O1_I4(x, x, x, x, x) C_O1_I4(r, r, reT, r, 0) C_O1_I4(r, r, r, ri, ri) C_O2_I1(r, r, L) diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index 2f67a97e05..342be30c4c 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -223,7 +223,7 @@ typedef enum { #define TCG_TARGET_HAS_sat_vec 1 #define TCG_TARGET_HAS_minmax_vec 1 #define TCG_TARGET_HAS_bitsel_vec have_avx512vl -#define TCG_TARGET_HAS_cmpsel_vec -1 +#define TCG_TARGET_HAS_cmpsel_vec 1 #define TCG_TARGET_HAS_tst_vec 0 #define TCG_TARGET_deposit_i32_valid(ofs, len) \ diff --git a/tcg/i386/tcg-target.opc.h b/tcg/i386/tcg-target.opc.h index b5f403e35e..4ffc084bda 100644 --- a/tcg/i386/tcg-target.opc.h +++ b/tcg/i386/tcg-target.opc.h @@ -25,7 +25,6 @@ */ DEF(x86_shufps_vec, 1, 2, 1, IMPLVEC) -DEF(x86_vpblendvb_vec, 1, 3, 0, IMPLVEC) DEF(x86_blend_vec, 1, 2, 1, IMPLVEC) DEF(x86_packss_vec, 1, 2, 0, IMPLVEC) DEF(x86_packus_vec, 1, 2, 0, IMPLVEC) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 278e567b56..a04dc7d270 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -3115,6 +3115,19 @@ static void tcg_out_cmp_vec(TCGContext *s, TCGType type, unsigned vece, } } +static void tcg_out_cmpsel_vec(TCGContext *s, TCGType type, unsigned vece, + TCGReg v0, TCGReg c1, TCGReg c2, + TCGReg v3, TCGReg v4, TCGCond cond) +{ + if (tcg_out_cmp_vec_noinv(s, type, vece, TCG_TMP_VEC, c1, c2, cond)) { + TCGReg swap = v3; + v3 = v4; + v4 = swap; + } + tcg_out_vex_modrm_type(s, OPC_VPBLENDVB, v0, v4, v3, type); + tcg_out8(s, (TCG_TMP_VEC - TCG_REG_XMM0) << 4); +} + static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned vecl, unsigned vece, const TCGArg args[TCG_MAX_OP_ARGS], @@ -3320,6 +3333,11 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, tcg_out_cmp_vec(s, type, vece, a0, a1, a2, args[3]); break; + case INDEX_op_cmpsel_vec: + tcg_out_cmpsel_vec(s, type, vece, a0, a1, a2, + args[3], args[4], args[5]); + break; + case INDEX_op_andc_vec: insn = OPC_PANDN; tcg_out_vex_modrm_type(s, insn, a0, a2, a1, type); @@ -3431,11 +3449,6 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, tcg_out8(s, sub); break; - case INDEX_op_x86_vpblendvb_vec: - tcg_out_vex_modrm_type(s, OPC_VPBLENDVB, a0, a1, a2, type); - tcg_out8(s, args[3] << 4); - break; - case INDEX_op_x86_psrldq_vec: tcg_out_vex_modrm(s, OPC_GRP14, 3, a0, a1); tcg_out8(s, a2); @@ -3701,8 +3714,9 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op) return C_O1_I3(x, 0, x, x); case INDEX_op_bitsel_vec: - case INDEX_op_x86_vpblendvb_vec: return C_O1_I3(x, x, x, x); + case INDEX_op_cmpsel_vec: + return C_O1_I4(x, x, x, x, x); default: g_assert_not_reached(); @@ -4038,8 +4052,8 @@ static void expand_vec_mul(TCGType type, unsigned vece, } } -static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, - TCGv_vec v1, TCGv_vec v2, TCGCond cond) +static TCGCond expand_vec_cond(TCGType type, unsigned vece, + TCGArg *a1, TCGArg *a2, TCGCond cond) { /* * Without AVX512, there are no 64-bit unsigned comparisons. @@ -4047,46 +4061,50 @@ static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, * All other swapping and inversion are handled during code generation. */ if (vece == MO_64 && is_unsigned_cond(cond)) { + TCGv_vec v1 = temp_tcgv_vec(arg_temp(*a1)); + TCGv_vec v2 = temp_tcgv_vec(arg_temp(*a2)); TCGv_vec t1 = tcg_temp_new_vec(type); TCGv_vec t2 = tcg_temp_new_vec(type); TCGv_vec t3 = tcg_constant_vec(type, vece, 1ull << ((8 << vece) - 1)); tcg_gen_sub_vec(vece, t1, v1, t3); tcg_gen_sub_vec(vece, t2, v2, t3); - v1 = t1; - v2 = t2; + *a1 = tcgv_vec_arg(t1); + *a2 = tcgv_vec_arg(t2); cond = tcg_signed_cond(cond); } - - /* Expand directly; do not recurse. */ - vec_gen_4(INDEX_op_cmp_vec, type, vece, - tcgv_vec_arg(v0), tcgv_vec_arg(v1), tcgv_vec_arg(v2), cond); + return cond; } -static void expand_vec_cmpsel(TCGType type, unsigned vece, TCGv_vec v0, - TCGv_vec c1, TCGv_vec c2, - TCGv_vec v3, TCGv_vec v4, TCGCond cond) +static void expand_vec_cmp(TCGType type, unsigned vece, TCGArg a0, + TCGArg a1, TCGArg a2, TCGCond cond) { - TCGv_vec t = tcg_temp_new_vec(type); + cond = expand_vec_cond(type, vece, &a1, &a2, cond); + /* Expand directly; do not recurse. */ + vec_gen_4(INDEX_op_cmp_vec, type, vece, a0, a1, a2, cond); +} - expand_vec_cmp(type, vece, t, c1, c2, cond); - vec_gen_4(INDEX_op_x86_vpblendvb_vec, type, vece, - tcgv_vec_arg(v0), tcgv_vec_arg(v4), - tcgv_vec_arg(v3), tcgv_vec_arg(t)); - tcg_temp_free_vec(t); +static void expand_vec_cmpsel(TCGType type, unsigned vece, TCGArg a0, + TCGArg a1, TCGArg a2, + TCGArg a3, TCGArg a4, TCGCond cond) +{ + cond = expand_vec_cond(type, vece, &a1, &a2, cond); + /* Expand directly; do not recurse. */ + vec_gen_6(INDEX_op_cmpsel_vec, type, vece, a0, a1, a2, a3, a4, cond); } void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, TCGArg a0, ...) { va_list va; - TCGArg a2; - TCGv_vec v0, v1, v2, v3, v4; + TCGArg a1, a2, a3, a4, a5; + TCGv_vec v0, v1, v2; va_start(va, a0); - v0 = temp_tcgv_vec(arg_temp(a0)); - v1 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); + a1 = va_arg(va, TCGArg); a2 = va_arg(va, TCGArg); + v0 = temp_tcgv_vec(arg_temp(a0)); + v1 = temp_tcgv_vec(arg_temp(a1)); switch (opc) { case INDEX_op_shli_vec: @@ -4122,15 +4140,15 @@ void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, break; case INDEX_op_cmp_vec: - v2 = temp_tcgv_vec(arg_temp(a2)); - expand_vec_cmp(type, vece, v0, v1, v2, va_arg(va, TCGArg)); + a3 = va_arg(va, TCGArg); + expand_vec_cmp(type, vece, a0, a1, a2, a3); break; case INDEX_op_cmpsel_vec: - v2 = temp_tcgv_vec(arg_temp(a2)); - v3 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); - v4 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); - expand_vec_cmpsel(type, vece, v0, v1, v2, v3, v4, va_arg(va, TCGArg)); + a3 = va_arg(va, TCGArg); + a4 = va_arg(va, TCGArg); + a5 = va_arg(va, TCGArg); + expand_vec_cmpsel(type, vece, a0, a1, a2, a3, a4, a5); break; default: From patchwork Wed Sep 11 16:50:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800923 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B3038EE0206 for ; Wed, 11 Sep 2024 16:53:36 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYV-0002AH-L5; Wed, 11 Sep 2024 12:50:59 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYT-00021z-Bq for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:57 -0400 Received: from mail-pg1-x532.google.com ([2607:f8b0:4864:20::532]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYR-0003eq-C2 for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:57 -0400 Received: by mail-pg1-x532.google.com with SMTP id 41be03b00d2f7-7d4f8a1626cso44836a12.3 for ; Wed, 11 Sep 2024 09:50:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073454; x=1726678254; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=9crvsERH/myYSJkgKez5YbnhkNv578itQ81kZkhssYk=; b=Z3H0ie0Y/5S2XO/J2tqaomsBf9dDnAdkGTuhd8BBhvA51efiSADVVg+0ip34Z8y6al PNJQJoovdhurwRNTLzaOwKS+9Ns1CBATROdIBoI2LbcrewgeXzKp9XWLPPnxIVqf7EcK ghge81Xv7VqTtm0FkolJNanPYEQdmEgXqs+AcXj38VXhux7zxs5OgZRdEIQeDKLX2/TP jpqGDpQn4vIEJQI8ynijQgN4+BZ9hlaWm17tBrQD1W60BXAPzzN/T8F5JjpEJNtwOBbj OMDki8T4xb0ZsNCqjnm8Nbcw3kCuKtqXB6a78wuRZQRH20AqKO3eUK9StTKbcMgqTn2I oTog== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073454; x=1726678254; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=9crvsERH/myYSJkgKez5YbnhkNv578itQ81kZkhssYk=; b=m7/AWcmOiGsCsyIVa2D29M+jVU+vB4XtV8tLgSFbiEse4uvYdZ8g54p0moBnq2pEA7 UJCtKIDkKPqcQc4qjBriF3eynANVbO5eBc8i/8a6aGliLSqvIQDXbuf1xvqLTQxy2eHe EFWZ+pBUDdAspayzp3ugjxHurKKlTjpRSixKs8C2FkhgjbTkrOPUiPAGKXprHaOW0P0a 8j7N9XXicm6OT4H2Mf5KFFdgSBYRrJgHEkDc+d2XucipOlzLHbGfvY9HHDIIuNyKb097 tFDbbtF/pT28UoYlqnbjYnyYgvU9se1y4LRPpxVa6fOMk4DXJZdc//Tkfa+IXi+83RSE I++g== X-Gm-Message-State: AOJu0YxrkkJtR44rKw8/3cgYYubQQTp7+GRN1Q9x/8ddtI3gpKGW1pDO 2AANCzg+A50/Hq/EYC9JU56/HXpP3jjnUkm6JwKcDOKrw0VRMI7DO6FOT8tbzbRIY7hr6lLJ+ly J X-Google-Smtp-Source: AGHT+IGu+J62BcE1uGscyzU1511se5adC1H8xAkhtv3svJuPWSPxr10+fvKM+UCnUBaGX1mnx4RgNA== X-Received: by 2002:a05:6a21:4a4c:b0:1c1:92f8:d3c6 with SMTP id adf61e73a8af0-1cf5e143e33mr7139906637.27.1726073453629; Wed, 11 Sep 2024 09:50:53 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.50.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:50:53 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 05/18] tcg/ppc: Do not expand cmp_vec early Date: Wed, 11 Sep 2024 09:50:34 -0700 Message-ID: <20240911165047.1035764-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::532; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x532.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Move expansion to opcode generation. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.c.inc | 169 +++++++++++++++++++++------------------ 1 file changed, 90 insertions(+), 79 deletions(-) diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc index 3553a47ba9..497e130581 100644 --- a/tcg/ppc/tcg-target.c.inc +++ b/tcg/ppc/tcg-target.c.inc @@ -3567,12 +3567,13 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_usadd_vec: case INDEX_op_ussub_vec: return vece <= MO_32; - case INDEX_op_cmp_vec: case INDEX_op_shli_vec: case INDEX_op_shri_vec: case INDEX_op_sari_vec: case INDEX_op_rotli_vec: return vece <= MO_32 || have_isa_2_07 ? -1 : 0; + case INDEX_op_cmp_vec: + return vece <= MO_32 || have_isa_2_07 ? 1 : 0; case INDEX_op_neg_vec: return vece >= MO_32 && have_isa_3_00; case INDEX_op_mul_vec: @@ -3713,6 +3714,90 @@ static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, return true; } +static void tcg_out_not_vec(TCGContext *s, TCGReg a0, TCGReg a1) +{ + tcg_out32(s, VNOR | VRT(a0) | VRA(a1) | VRB(a1)); +} + +static bool tcg_out_cmp_vec_noinv(TCGContext *s, unsigned vece, TCGReg a0, + TCGReg a1, TCGReg a2, TCGCond cond) +{ + static const uint32_t + eq_op[4] = { VCMPEQUB, VCMPEQUH, VCMPEQUW, VCMPEQUD }, + ne_op[4] = { VCMPNEB, VCMPNEH, VCMPNEW, 0 }, + gts_op[4] = { VCMPGTSB, VCMPGTSH, VCMPGTSW, VCMPGTSD }, + gtu_op[4] = { VCMPGTUB, VCMPGTUH, VCMPGTUW, VCMPGTUD }; + uint32_t insn; + + bool need_swap = false, need_inv = false; + + tcg_debug_assert(vece <= MO_32 || have_isa_2_07); + + switch (cond) { + case TCG_COND_EQ: + case TCG_COND_GT: + case TCG_COND_GTU: + break; + case TCG_COND_NE: + if (have_isa_3_00 && vece <= MO_32) { + break; + } + /* fall through */ + case TCG_COND_LE: + case TCG_COND_LEU: + need_inv = true; + break; + case TCG_COND_LT: + case TCG_COND_LTU: + need_swap = true; + break; + case TCG_COND_GE: + case TCG_COND_GEU: + need_swap = need_inv = true; + break; + default: + g_assert_not_reached(); + } + + if (need_inv) { + cond = tcg_invert_cond(cond); + } + if (need_swap) { + TCGReg swap = a1; + a1 = a2; + a2 = swap; + cond = tcg_swap_cond(cond); + } + + switch (cond) { + case TCG_COND_EQ: + insn = eq_op[vece]; + break; + case TCG_COND_NE: + insn = ne_op[vece]; + break; + case TCG_COND_GT: + insn = gts_op[vece]; + break; + case TCG_COND_GTU: + insn = gtu_op[vece]; + break; + default: + g_assert_not_reached(); + } + tcg_out32(s, insn | VRT(a0) | VRA(a1) | VRB(a2)); + + return need_inv; +} + +static void tcg_out_cmp_vec(TCGContext *s, unsigned vece, TCGReg a0, + TCGReg a1, TCGReg a2, TCGCond cond) +{ + if (tcg_out_cmp_vec_noinv(s, vece, a0, a1, a2, cond)) { + tcg_out_not_vec(s, a0, a0); + } +} + static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned vecl, unsigned vece, const TCGArg args[TCG_MAX_OP_ARGS], @@ -3723,10 +3808,6 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, sub_op[4] = { VSUBUBM, VSUBUHM, VSUBUWM, VSUBUDM }, mul_op[4] = { 0, 0, VMULUWM, VMULLD }, neg_op[4] = { 0, 0, VNEGW, VNEGD }, - eq_op[4] = { VCMPEQUB, VCMPEQUH, VCMPEQUW, VCMPEQUD }, - ne_op[4] = { VCMPNEB, VCMPNEH, VCMPNEW, 0 }, - gts_op[4] = { VCMPGTSB, VCMPGTSH, VCMPGTSW, VCMPGTSD }, - gtu_op[4] = { VCMPGTUB, VCMPGTUH, VCMPGTUW, VCMPGTUD }, ssadd_op[4] = { VADDSBS, VADDSHS, VADDSWS, 0 }, usadd_op[4] = { VADDUBS, VADDUHS, VADDUWS, 0 }, sssub_op[4] = { VSUBSBS, VSUBSHS, VSUBSWS, 0 }, @@ -3820,9 +3901,8 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, insn = VANDC; break; case INDEX_op_not_vec: - insn = VNOR; - a2 = a1; - break; + tcg_out_not_vec(s, a0, a1); + return; case INDEX_op_orc_vec: insn = VORC; break; @@ -3837,23 +3917,8 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, break; case INDEX_op_cmp_vec: - switch (args[3]) { - case TCG_COND_EQ: - insn = eq_op[vece]; - break; - case TCG_COND_NE: - insn = ne_op[vece]; - break; - case TCG_COND_GT: - insn = gts_op[vece]; - break; - case TCG_COND_GTU: - insn = gtu_op[vece]; - break; - default: - g_assert_not_reached(); - } - break; + tcg_out_cmp_vec(s, vece, a0, a1, a2, args[3]); + return; case INDEX_op_bitsel_vec: tcg_out32(s, XXSEL | VRT(a0) | VRC(a1) | VRB(a2) | VRA(args[3])); @@ -3921,56 +3986,6 @@ static void expand_vec_shi(TCGType type, unsigned vece, TCGv_vec v0, tcgv_vec_arg(v1), tcgv_vec_arg(t1)); } -static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, - TCGv_vec v1, TCGv_vec v2, TCGCond cond) -{ - bool need_swap = false, need_inv = false; - - tcg_debug_assert(vece <= MO_32 || have_isa_2_07); - - switch (cond) { - case TCG_COND_EQ: - case TCG_COND_GT: - case TCG_COND_GTU: - break; - case TCG_COND_NE: - if (have_isa_3_00 && vece <= MO_32) { - break; - } - /* fall through */ - case TCG_COND_LE: - case TCG_COND_LEU: - need_inv = true; - break; - case TCG_COND_LT: - case TCG_COND_LTU: - need_swap = true; - break; - case TCG_COND_GE: - case TCG_COND_GEU: - need_swap = need_inv = true; - break; - default: - g_assert_not_reached(); - } - - if (need_inv) { - cond = tcg_invert_cond(cond); - } - if (need_swap) { - TCGv_vec t1; - t1 = v1, v1 = v2, v2 = t1; - cond = tcg_swap_cond(cond); - } - - vec_gen_4(INDEX_op_cmp_vec, type, vece, tcgv_vec_arg(v0), - tcgv_vec_arg(v1), tcgv_vec_arg(v2), cond); - - if (need_inv) { - tcg_gen_not_vec(vece, v0, v0); - } -} - static void expand_vec_mul(TCGType type, unsigned vece, TCGv_vec v0, TCGv_vec v1, TCGv_vec v2) { @@ -4045,10 +4060,6 @@ void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, case INDEX_op_rotli_vec: expand_vec_shi(type, vece, v0, v1, a2, INDEX_op_rotlv_vec); break; - case INDEX_op_cmp_vec: - v2 = temp_tcgv_vec(arg_temp(a2)); - expand_vec_cmp(type, vece, v0, v1, v2, va_arg(va, TCGArg)); - break; case INDEX_op_mul_vec: v2 = temp_tcgv_vec(arg_temp(a2)); expand_vec_mul(type, vece, v0, v1, v2); From patchwork Wed Sep 11 16:50:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800921 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DD76CEE57C6 for ; Wed, 11 Sep 2024 16:53:20 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYW-0002Ck-9V; Wed, 11 Sep 2024 12:51:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYU-00024z-5T for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:58 -0400 Received: from mail-pg1-x52c.google.com ([2607:f8b0:4864:20::52c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYS-0003ey-2k for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:57 -0400 Received: by mail-pg1-x52c.google.com with SMTP id 41be03b00d2f7-7163489149eso71387a12.1 for ; Wed, 11 Sep 2024 09:50:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073455; x=1726678255; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=m+T4dcqyJxvJdq82j2M/HcYPFc2eysHobPrKz3oAZqo=; b=cvVK1PcvCMWYn35CN/50Cmg/b1JmxKH99U4Wnwe/M1THKD182wjK43f7yu55xiILC3 eqlIrNhBqvoKeE9y7rlw90JMVlgjtf1koKN1yleRq5O+WaWYsWiYOr9Ewm0u/PihP27x mZuwpZcawFQYPfC1Z/DAITw4rVQmQsa+RYB5cLvVZ+kSV8eCMsSK1cw3/sISg26MI81Q BkIXNXFbwb/VncOFXVPsZBJ9fBb3P+NFWv+DcbqTOUP58ZM+S71rMq3bX57wVVxdTYy7 998Nz8Q34AdWvEznSYwcy57c3/Py5e4489zqvDUGPTdb+9whCrApH6dEdU51oAnBTXZG z25g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073455; x=1726678255; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=m+T4dcqyJxvJdq82j2M/HcYPFc2eysHobPrKz3oAZqo=; b=ncO8+y2QWfHE7Wkqe2ClJP9/9VvFSLI0j7OCk//bm1zuavK/dDV6VAy1Ua9DtT7aaq ZhippeHnd0BMIsM91YaBSY74ElNlm3coVzAZ6QBz3q5QDtnDTZQupjOF1UG1UL+lRoBf 2f/PTKhJrzMKend3+8JlJlqTR9A+uH0v4i7s7Hr2AM3BFcWzhnzKE3MN5KxxbI/uS6B3 HQ8fH+uoYV4GnZ4Rjgp9kag9DSo5YV16TVUtFQlhqM5xgSaPiv3KImzRb133K6MsvYCD flhCgZN+x+FQKcQ41aiOiFbPT+OX+UAAcL1/h1s3K+L4FJR4j6SIgdRirWPiPiKXwGBr Z8cA== X-Gm-Message-State: AOJu0YxgU/C5trE6XhUL5E58HefwqjgfCn1lmuFkBAwAPbvtOPBjPOuw PnLDSwX2Gv03uF9m+/lnDtMncTl0GNSful1qt26CVwg3YhggZNGR7FkpbZgDG+3AOUH74SiDNCi L X-Google-Smtp-Source: AGHT+IG1WlnVTprn91dLrD91pQiFsNcwtLYy9qZqauWcbpS5Jm3c4v1D+QxnQc7u8q8EYzABrUzqBQ== X-Received: by 2002:a05:6300:41:b0:1cf:1fbd:9299 with SMTP id adf61e73a8af0-1cf5e1f6738mr7490421637.49.1726073454688; Wed, 11 Sep 2024 09:50:54 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.50.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:50:54 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 06/18] tcg/s390x: Do not expand cmp_vec early Date: Wed, 11 Sep 2024 09:50:35 -0700 Message-ID: <20240911165047.1035764-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52c; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Move expansion to opcode generation. Signed-off-by: Richard Henderson --- tcg/s390x/tcg-target.c.inc | 139 +++++++++++++++++-------------------- 1 file changed, 65 insertions(+), 74 deletions(-) diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc index ad587325fc..23935fd0f0 100644 --- a/tcg/s390x/tcg-target.c.inc +++ b/tcg/s390x/tcg-target.c.inc @@ -2841,6 +2841,67 @@ static void tcg_out_dupi_vec(TCGContext *s, TCGType type, unsigned vece, tcg_out_insn(s, VRX, VLREP, dst, TCG_TMP0, TCG_REG_NONE, 0, MO_64); } +static bool tcg_out_cmp_vec_noinv(TCGContext *s, unsigned vece, TCGReg a0, + TCGReg a1, TCGReg a2, TCGCond cond) +{ + bool need_swap = false, need_inv = false; + + switch (cond) { + case TCG_COND_EQ: + case TCG_COND_GT: + case TCG_COND_GTU: + break; + case TCG_COND_NE: + case TCG_COND_LE: + case TCG_COND_LEU: + need_inv = true; + break; + case TCG_COND_LT: + case TCG_COND_LTU: + need_swap = true; + break; + case TCG_COND_GE: + case TCG_COND_GEU: + need_swap = need_inv = true; + break; + default: + g_assert_not_reached(); + } + + if (need_inv) { + cond = tcg_invert_cond(cond); + } + if (need_swap) { + TCGReg swap = a1; + a1 = a2; + a2 = swap; + cond = tcg_swap_cond(cond); + } + + switch (cond) { + case TCG_COND_EQ: + tcg_out_insn(s, VRRc, VCEQ, a0, a1, a2, vece); + break; + case TCG_COND_GT: + tcg_out_insn(s, VRRc, VCH, a0, a1, a2, vece); + break; + case TCG_COND_GTU: + tcg_out_insn(s, VRRc, VCHL, a0, a1, a2, vece); + break; + default: + g_assert_not_reached(); + } + return need_inv; +} + +static void tcg_out_cmp_vec(TCGContext *s, unsigned vece, TCGReg a0, + TCGReg a1, TCGReg a2, TCGCond cond) +{ + if (tcg_out_cmp_vec_noinv(s, vece, a0, a1, a2, cond)) { + tcg_out_insn(s, VRRc, VNO, a0, a0, a0, 0); + } +} + static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned vecl, unsigned vece, const TCGArg args[TCG_MAX_OP_ARGS], @@ -2959,19 +3020,7 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, break; case INDEX_op_cmp_vec: - switch ((TCGCond)args[3]) { - case TCG_COND_EQ: - tcg_out_insn(s, VRRc, VCEQ, a0, a1, a2, vece); - break; - case TCG_COND_GT: - tcg_out_insn(s, VRRc, VCH, a0, a1, a2, vece); - break; - case TCG_COND_GTU: - tcg_out_insn(s, VRRc, VCHL, a0, a1, a2, vece); - break; - default: - g_assert_not_reached(); - } + tcg_out_cmp_vec(s, vece, a0, a1, a2, args[3]); break; case INDEX_op_s390_vuph_vec: @@ -3024,8 +3073,8 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_umax_vec: case INDEX_op_umin_vec: case INDEX_op_xor_vec: - return 1; case INDEX_op_cmp_vec: + return 1; case INDEX_op_cmpsel_vec: case INDEX_op_rotrv_vec: return -1; @@ -3039,68 +3088,14 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) } } -static bool expand_vec_cmp_noinv(TCGType type, unsigned vece, TCGv_vec v0, - TCGv_vec v1, TCGv_vec v2, TCGCond cond) -{ - bool need_swap = false, need_inv = false; - - switch (cond) { - case TCG_COND_EQ: - case TCG_COND_GT: - case TCG_COND_GTU: - break; - case TCG_COND_NE: - case TCG_COND_LE: - case TCG_COND_LEU: - need_inv = true; - break; - case TCG_COND_LT: - case TCG_COND_LTU: - need_swap = true; - break; - case TCG_COND_GE: - case TCG_COND_GEU: - need_swap = need_inv = true; - break; - default: - g_assert_not_reached(); - } - - if (need_inv) { - cond = tcg_invert_cond(cond); - } - if (need_swap) { - TCGv_vec t1; - t1 = v1, v1 = v2, v2 = t1; - cond = tcg_swap_cond(cond); - } - - vec_gen_4(INDEX_op_cmp_vec, type, vece, tcgv_vec_arg(v0), - tcgv_vec_arg(v1), tcgv_vec_arg(v2), cond); - - return need_inv; -} - -static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, - TCGv_vec v1, TCGv_vec v2, TCGCond cond) -{ - if (expand_vec_cmp_noinv(type, vece, v0, v1, v2, cond)) { - tcg_gen_not_vec(vece, v0, v0); - } -} - static void expand_vec_cmpsel(TCGType type, unsigned vece, TCGv_vec v0, TCGv_vec c1, TCGv_vec c2, TCGv_vec v3, TCGv_vec v4, TCGCond cond) { TCGv_vec t = tcg_temp_new_vec(type); - if (expand_vec_cmp_noinv(type, vece, t, c1, c2, cond)) { - /* Invert the sense of the compare by swapping arguments. */ - tcg_gen_bitsel_vec(vece, v0, t, v4, v3); - } else { - tcg_gen_bitsel_vec(vece, v0, t, v3, v4); - } + tcg_gen_cmp_vec(cond, vece, t, c1, c2); + tcg_gen_bitsel_vec(vece, v0, t, v3, v4); tcg_temp_free_vec(t); } @@ -3153,10 +3148,6 @@ void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, v2 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); switch (opc) { - case INDEX_op_cmp_vec: - expand_vec_cmp(type, vece, v0, v1, v2, va_arg(va, TCGArg)); - break; - case INDEX_op_cmpsel_vec: v3 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); v4 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); From patchwork Wed Sep 11 16:50:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800918 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1A807EE57C6 for ; Wed, 11 Sep 2024 16:53:02 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYW-0002EJ-M1; Wed, 11 Sep 2024 12:51:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYU-00027L-Pl for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:58 -0400 Received: from mail-pg1-x531.google.com ([2607:f8b0:4864:20::531]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYT-0003fF-6t for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:58 -0400 Received: by mail-pg1-x531.google.com with SMTP id 41be03b00d2f7-7d4f85766f0so62239a12.2 for ; Wed, 11 Sep 2024 09:50:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073456; x=1726678256; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=OWl7Xv2JGXBckAdLaFuhNWtujP2/LH0xkeEqhVhx7Pk=; b=cTnzXQ3tFsRr2YACtK5S09dtpXU64mhaoMeXSU6bhHFk9/hrfxBF1SXnjpYg6qP4vQ rkeHr6yRbs5PQutjnJpW7x+U77IzVsZHLhpwXD33x/D3uCGetBNFycDCBBJcox92U38M +9Zpgf82cgsiIwSU/ynYPMj+cqweZ/Got6ApdrzkGO3dYVwC+hATP+aj0d6ur6Pgxpaw Vs1ODoSLefB2G1VKc+86TwvMpEa+oVf6VvSjBenrmaeHGrsx7KFN2VyENq/ZXp0MIJSs EgaAPX7sBleNpaLqtIfB2Tsyh3Bt2ToA8XjLB8yYMk3D8iscs81NojV0HYKD5kr7ZC4w n21Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073456; x=1726678256; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=OWl7Xv2JGXBckAdLaFuhNWtujP2/LH0xkeEqhVhx7Pk=; b=jrPEY8WXiV+vFmpwPGWpMCDdUSJLAqXiGq7eW3i+0QUgHnEwIJORaRGXJVVZ7Z9q71 h/O/84oGiMbGsqu554e8XZaRU/6l2aHhnTLygSYlxij+OUn/W0/TLTvxMFiyTP2RIT/U CWjoqopvs3MzN8yvUKxvARfjWMwcsrZZyamKacFEbkoRDTJut3catVU9bp6P/F0kltgz dz0BLp1SIFRc52kSYVs23d3RdtWEkmiMJXVnX6QhokIAbYGGJAKZWxEtb/uXfTw+yTGR BCrQkCOrsr6rSRVWvfQ/4kMqxX32ONs92kIA6O3cCZiB9RB1+fwVZtm0LB/kWDx8xbAy /B4A== X-Gm-Message-State: AOJu0YzBR2S6KmQAqPv9BPWQhH2ZZm6kApE5EIjlilhnCXCzQQhrzhdh 9VvMDh29DLfwvezqsMepipZg72tzBm161N5r01D1MYE7+r+pjhZwrddtmHRPzG4waXnefl1Khgw I X-Google-Smtp-Source: AGHT+IGC2AL6z0GKInsRR+HteUSB03nd4G2bjhHWkkNJSRlq7DkDuQkM9t/SbCX3LBZzcNtF4vSZYw== X-Received: by 2002:a05:6a20:4499:b0:1ce:d418:a45c with SMTP id adf61e73a8af0-1cf62d88a65mr4115749637.50.1726073455567; Wed, 11 Sep 2024 09:50:55 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.50.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:50:55 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 07/18] tcg/optimize: Fold movcond with true and false values identical Date: Wed, 11 Sep 2024 09:50:36 -0700 Message-ID: <20240911165047.1035764-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::531; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x531.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Fold "x = cond ? y : y" to "x = y". Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/optimize.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/tcg/optimize.c b/tcg/optimize.c index ba16ec27e2..cf311790e0 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -1851,6 +1851,11 @@ static bool fold_movcond(OptContext *ctx, TCGOp *op) { int i; + /* If true and false values are the same, eliminate the cmp. */ + if (args_are_copies(op->args[3], op->args[4])) { + return tcg_opt_gen_mov(ctx, op, op->args[0], op->args[3]); + } + /* * Canonicalize the "false" input reg to match the destination reg so * that the tcg backend can implement a "move if true" operation. From patchwork Wed Sep 11 16:50:37 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800915 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6587FEE57C6 for ; Wed, 11 Sep 2024 16:52:51 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYX-0002JI-U3; Wed, 11 Sep 2024 12:51:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYV-0002An-Oc for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:59 -0400 Received: from mail-pg1-x529.google.com ([2607:f8b0:4864:20::529]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYU-0003fR-5h for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:50:59 -0400 Received: by mail-pg1-x529.google.com with SMTP id 41be03b00d2f7-7db174f9050so30673a12.3 for ; Wed, 11 Sep 2024 09:50:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073457; x=1726678257; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=tk0wGAYaA1BFCH8Bi5mZsOhd6+SwTodI8yFR1/zj/XA=; b=FlXBPM95djXqNx3Cm/ZUu/skoabuT4/rvdoEIgRp+jdM+Whfwczis24DJyL28ibXAG 3lKAZW0fVBA47XqunI6+DRHrFqWetkE46Nt0dMG/SpR9+ucSj8eUox2KwBApJDggdl40 uu+5tPRBgb2MWZ4lTU68aj5eJFDYsny+U1qQKTGYz1DuOlkNZqjpXMcS8le3lKt2yP7L 0i5taD+z4pOWt8MmSpAwlHyMSgj511YLzovK+p24mddpYx/P/3AchQpN7IFK66gHH7R3 X0tsQtGEJWUExbmR59i0iHe/cg+IIPJo9MjPWUZat1HcHsTvvJeXEkGEkLJhwVcL886b lmVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073457; x=1726678257; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tk0wGAYaA1BFCH8Bi5mZsOhd6+SwTodI8yFR1/zj/XA=; b=vzNMcmEPtAHbo2uHzyZqAQPGjeBjymD3aCAxdt5lUjQ7UuO6umlj0UfLSm222E/6Hs u2adYoeM6YidjhFeh+nq9269pRGqTatODrJGa19VQuBnbh4WJYwDPIXeha644WkivMAw z5O+5+J2jlmdU24Q1P1OieDd16yp0khkCleqZxCdtH67Iz0FswRsJXOUaVW6rAtCUkb0 wGQ37CF/RWtUyEBLMqHrGd9ZfzZdCELrZ8FGUS3ivQa/78ZXqdp+cBjoCdYOfkGRMODh 0vgzglvSb+GVtUviPHEAGNDJTdqcy5+FG2kf3mbeFNivvWwLT61lTkEholKl/DKHXKki mcJg== X-Gm-Message-State: AOJu0YzBVRcvavn+fr99Uy1LDkuD0azLmos+qvM7VsJpU+ZuV3X1UnfV vTFUr3MjHOGu0rnu66/m9SaNoBVtrkLR2Oxf9VbPTUvF/uEWlhJ6xxgkx3IrdoWTvqEaFrUn9xV + X-Google-Smtp-Source: AGHT+IFOMOY//SS+VRTmGq00IjikX+qfk2DEF+o3pPHtEd10lPMxlyL5IBu9kJxGIO4uUzaKC+xIqQ== X-Received: by 2002:a05:6a21:3318:b0:1cf:54bd:393a with SMTP id adf61e73a8af0-1cf5e15780fmr5944984637.37.1726073456480; Wed, 11 Sep 2024 09:50:56 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.50.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:50:56 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 08/18] tcg/optimize: Optimize cmp_vec and cmpsel_vec Date: Wed, 11 Sep 2024 09:50:37 -0700 Message-ID: <20240911165047.1035764-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::529; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x529.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Place immediate values second in the comparison. Place destination matches first in the true/false values. All of this mirrors what we do for integer setcond and movcond. Signed-off-by: Richard Henderson --- tcg/optimize.c | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/tcg/optimize.c b/tcg/optimize.c index cf311790e0..f11f576fd4 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -2422,6 +2422,36 @@ static bool fold_setcond2(OptContext *ctx, TCGOp *op) return tcg_opt_gen_movi(ctx, op, op->args[0], i); } +static bool fold_cmp_vec(OptContext *ctx, TCGOp *op) +{ + /* Canonicalize the comparison to put immediate second. */ + if (swap_commutative(NO_DEST, &op->args[1], &op->args[2])) { + op->args[3] = tcg_swap_cond(op->args[3]); + } + return false; +} + +static bool fold_cmpsel_vec(OptContext *ctx, TCGOp *op) +{ + /* If true and false values are the same, eliminate the cmp. */ + if (args_are_copies(op->args[3], op->args[4])) { + return tcg_opt_gen_mov(ctx, op, op->args[0], op->args[3]); + } + + /* Canonicalize the comparison to put immediate second. */ + if (swap_commutative(NO_DEST, &op->args[1], &op->args[2])) { + op->args[5] = tcg_swap_cond(op->args[5]); + } + /* + * Canonicalize the "false" input reg to match the destination, + * so that the tcg backend can implement "move if true". + */ + if (swap_commutative(op->args[0], &op->args[4], &op->args[3])) { + op->args[5] = tcg_invert_cond(op->args[5]); + } + return false; +} + static bool fold_sextract(OptContext *ctx, TCGOp *op) { uint64_t z_mask, s_mask, s_mask_old; @@ -2928,6 +2958,12 @@ void tcg_optimize(TCGContext *s) case INDEX_op_setcond2_i32: done = fold_setcond2(&ctx, op); break; + case INDEX_op_cmp_vec: + done = fold_cmp_vec(&ctx, op); + break; + case INDEX_op_cmpsel_vec: + done = fold_cmpsel_vec(&ctx, op); + break; CASE_OP_32_64(sextract): done = fold_sextract(&ctx, op); break; From patchwork Wed Sep 11 16:50:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800906 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B672CEE57C7 for ; Wed, 11 Sep 2024 16:51:48 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYa-0002U8-Is; Wed, 11 Sep 2024 12:51:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYX-0002Ft-1q for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:01 -0400 Received: from mail-oa1-x33.google.com ([2001:4860:4864:20::33]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYV-0003fk-D8 for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:00 -0400 Received: by mail-oa1-x33.google.com with SMTP id 586e51a60fabf-277c742307fso761038fac.2 for ; Wed, 11 Sep 2024 09:50:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073458; x=1726678258; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=+zeN6h7OSkzIznAtGXkS2TM05Vzv6xYEK4cYNsvDB5Y=; b=aSuj+BuU9IXLkS1fISfjs6vBo67MaHDFyv2kTvaGry/gL/ML4xgQKUXXvbjWLDgm3j huoCk8RY37Gz7AKisSAe2hPCBr4rtXeLZuVVXNMbdAf6p5Nj98v7HMOlo7L806nJ7DSa F7+NMGqF89SQBb2Dm3TdZwXqhbqdMnmB+U4Sn9ZdNg9RWV6glagu794uU9FT8DTA0oRd yTpQO9l+vSKR0Dy2yDiYTrmTn9as1wMEP6N9GgT3Tzmeq/hapUcCl/EXsJcnVYLTnH57 zm2zLmUbUTOnNlf+XvYyFIM2c0AJIro/qg2tCrXpDiDTdS1yKQxiZFw61TiQuj49FjK4 wxdw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073458; x=1726678258; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+zeN6h7OSkzIznAtGXkS2TM05Vzv6xYEK4cYNsvDB5Y=; b=WR2vAofLfQ5Zv9xZ8sT2P60XbQm5J6igtIfpAOx9pmkQHw4446+UmVcq2339K62UOy wS/yD7jls7KO3+x0ATmz5V6LcnQAs4HNx+GhY7EMUF0VgeXD9N6zDHWB3bvWiebhLlOl 7QNlBM1YK2jkUS3W9Gtr7pnwHzkW7ut6gYSZI5EI+wwYx+94fC0FsOoiP3OMpfbK3dp6 3toaLbGmX3eJRTWy8c2Qo0zP4sSQtTJw3x13lJItpcU0yfscmK0DEq+s/N5YC9dkSx3K BEZQ5he+Z31uoi/B2mkuHevH+ZZnqhc6pXFepzmqstcWv6LGl2kV+l5W3LwvdnCGhzQc RtXA== X-Gm-Message-State: AOJu0YwYLgsb1gzkPD+KDgAhWpXHsclCfa8k76yCEjgjvIJ/DdeNrUBt kOH+p5XdOcsxDnJntS8Lho6sDDCGJ3t3ZGe2NtbFbpaswuNJcXHmVFijUiUXqiO0WnQDc8MBhAB 1 X-Google-Smtp-Source: AGHT+IHuRXHhtD77WFfmeAFubyLtFkBWZJiGbfZsyo4rs84lAxbagniW6JV/l62Eq+sNyVxProCHuA== X-Received: by 2002:a05:6870:288e:b0:278:234e:6e3f with SMTP id 586e51a60fabf-27c1b53e12emr2798404fac.25.1726073457524; Wed, 11 Sep 2024 09:50:57 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.50.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:50:57 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 09/18] tcg/optimize: Optimize bitsel_vec Date: Wed, 11 Sep 2024 09:50:38 -0700 Message-ID: <20240911165047.1035764-10-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2001:4860:4864:20::33; envelope-from=richard.henderson@linaro.org; helo=mail-oa1-x33.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Fold matching true/false operands. Fold true/false operands with 0/-1 to simpler logicals. Signed-off-by: Richard Henderson --- tcg/optimize.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) diff --git a/tcg/optimize.c b/tcg/optimize.c index f11f576fd4..e9ef16b3c6 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -2737,6 +2737,61 @@ static bool fold_xor(OptContext *ctx, TCGOp *op) return fold_masks(ctx, op); } +static bool fold_bitsel_vec(OptContext *ctx, TCGOp *op) +{ + /* If true and false values are the same, eliminate the cmp. */ + if (args_are_copies(op->args[2], op->args[3])) { + return tcg_opt_gen_mov(ctx, op, op->args[0], op->args[2]); + } + + if (arg_is_const(op->args[2]) && arg_is_const(op->args[3])) { + uint64_t tv = arg_info(op->args[2])->val; + uint64_t fv = arg_info(op->args[3])->val; + + if (tv == -1 && fv == 0) { + return tcg_opt_gen_mov(ctx, op, op->args[0], op->args[1]); + } + if (tv == 0 && fv == -1) { + if (TCG_TARGET_HAS_not_vec) { + op->opc = INDEX_op_not_vec; + return fold_not(ctx, op); + } else { + op->opc = INDEX_op_xor_vec; + op->args[2] = arg_new_constant(ctx, -1); + return fold_xor(ctx, op); + } + } + } + if (arg_is_const(op->args[2])) { + uint64_t tv = arg_info(op->args[2])->val; + if (tv == -1) { + op->opc = INDEX_op_or_vec; + op->args[2] = op->args[3]; + return fold_or(ctx, op); + } + if (tv == 0 && TCG_TARGET_HAS_andc_vec) { + op->opc = INDEX_op_andc_vec; + op->args[2] = op->args[1]; + op->args[1] = op->args[3]; + return fold_andc(ctx, op); + } + } + if (arg_is_const(op->args[3])) { + uint64_t fv = arg_info(op->args[3])->val; + if (fv == 0) { + op->opc = INDEX_op_and_vec; + return fold_and(ctx, op); + } + if (fv == -1 && TCG_TARGET_HAS_orc_vec) { + op->opc = INDEX_op_orc_vec; + op->args[2] = op->args[1]; + op->args[1] = op->args[3]; + return fold_orc(ctx, op); + } + } + return false; +} + /* Propagate constants and copies, fold constant expressions. */ void tcg_optimize(TCGContext *s) { @@ -2964,6 +3019,9 @@ void tcg_optimize(TCGContext *s) case INDEX_op_cmpsel_vec: done = fold_cmpsel_vec(&ctx, op); break; + case INDEX_op_bitsel_vec: + done = fold_bitsel_vec(&ctx, op); + break; CASE_OP_32_64(sextract): done = fold_sextract(&ctx, op); break; From patchwork Wed Sep 11 16:50:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800910 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 09F82EE57C6 for ; Wed, 11 Sep 2024 16:52:05 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYY-0002Lw-Dt; Wed, 11 Sep 2024 12:51:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYX-0002IT-Md for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:01 -0400 Received: from mail-pf1-x42d.google.com ([2607:f8b0:4864:20::42d]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYV-0003fv-VL for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:01 -0400 Received: by mail-pf1-x42d.google.com with SMTP id d2e1a72fcca58-7191ee537cbso644796b3a.2 for ; Wed, 11 Sep 2024 09:50:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073458; x=1726678258; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=mDvGY07edr19JucuS9HGBne2HdSXpx09i/5YVq2X/eo=; b=rSLE1204Pj1Aw7mH5qj1F9S1E+QdWCTeD29iWfTY1TyY/+tQt5KRSWOHnXpR6tY2RI P0TwkBxtuhpW9g+S/JWpPWAAu45JxJrYPhGVlaxzVXlqecsZPlZo62SnUAnSRQFE8bln Z8vK1ZGDcUAoqk4HzOHgx84sOQcj/XekccTctjWKRCm0UM1kxy7XvhIpcmbi7E4AUamm h7VVLE/Cpu8AsMfSJkozyGltG2z5dnb/BMgb1aD+W+JcdoAry+ptRnMejE+ZC1hQMOkF IK38OisPQWnIp/oBeuZ1JB+KREV44d1iOCu6mEcJzZwcQ4DCuuUhTmNKl5r6kbYq1EYr KqwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073458; x=1726678258; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=mDvGY07edr19JucuS9HGBne2HdSXpx09i/5YVq2X/eo=; b=rxyTqyFe4JgrArWtSTlASW8Jprj5viK2kX3Nx/BwNkxB6oKGyQC72JDK0DiLmJz1uz zOkx6kTs3rmDMSlBLblgcTj8NqMl5JJix93NNGYWHvQXO7cWukRv+1cR0kuVAoqJiGDb 54Eiw0MR8KWjvJ5cb2QYFAOZMAJCrk0v2f2m6OxB8FeDmiapb2kHbr/C2tT2eM8COGMM xV5LylbodT8Xm99GPRXR/gSk4cL8K2QmLwCQwFL7wsr+MWSVlyyyETwjGV/kFgxvBkNB g7NyxFcZX08bUxmx8ynSJp+DfFiUwSqZwM7o0T4YMroejrkIuzqxz3Y8ST4HaOnwguHs 6QLA== X-Gm-Message-State: AOJu0Yxb/3zVewVl6usxUrIuM8kjVDbJ7y/wVoimXML8LHMkYk48XJH5 r8b24QEoVITV5V2/xnYbwJCJy6TDO0cBydhih/Hoqn8leOGeOLDZye3pof7iTHyhYXKkMgaXsVK P X-Google-Smtp-Source: AGHT+IHExn+eElbtJnT0u7zmfxKvxJE3VJQimjRXK1Ud3KG83Kzx4SBdzwO42aOGCEdaBa7mGEAfXg== X-Received: by 2002:a05:6a21:39b:b0:1bd:2214:e92f with SMTP id adf61e73a8af0-1cf5e075b42mr7085470637.14.1726073458526; Wed, 11 Sep 2024 09:50:58 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.50.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:50:58 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 10/18] tcg/i386: Optimize cmpsel with constant 0 operand 3. Date: Wed, 11 Sep 2024 09:50:39 -0700 Message-ID: <20240911165047.1035764-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42d; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org These can be simplified to and/andc, avoiding the load of the zero into a register. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target-con-set.h | 2 +- tcg/i386/tcg-target-con-str.h | 1 + tcg/i386/tcg-target.c.inc | 32 +++++++++++++++++++++++++------- 3 files changed, 27 insertions(+), 8 deletions(-) diff --git a/tcg/i386/tcg-target-con-set.h b/tcg/i386/tcg-target-con-set.h index da4411d96b..06e6521001 100644 --- a/tcg/i386/tcg-target-con-set.h +++ b/tcg/i386/tcg-target-con-set.h @@ -50,7 +50,7 @@ C_N1_I2(r, r, r) C_N1_I2(r, r, rW) C_O1_I3(x, 0, x, x) C_O1_I3(x, x, x, x) -C_O1_I4(x, x, x, x, x) +C_O1_I4(x, x, x, xO, x) C_O1_I4(r, r, reT, r, 0) C_O1_I4(r, r, r, ri, ri) C_O2_I1(r, r, L) diff --git a/tcg/i386/tcg-target-con-str.h b/tcg/i386/tcg-target-con-str.h index cc22db227b..52142ab121 100644 --- a/tcg/i386/tcg-target-con-str.h +++ b/tcg/i386/tcg-target-con-str.h @@ -28,6 +28,7 @@ REGS('s', ALL_BYTEL_REGS & ~SOFTMMU_RESERVE_REGS) /* qemu_st8_i32 data */ */ CONST('e', TCG_CT_CONST_S32) CONST('I', TCG_CT_CONST_I32) +CONST('O', TCG_CT_CONST_ZERO) CONST('T', TCG_CT_CONST_TST) CONST('W', TCG_CT_CONST_WSZ) CONST('Z', TCG_CT_CONST_U32) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index a04dc7d270..210389955d 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -133,6 +133,7 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot) #define TCG_CT_CONST_I32 0x400 #define TCG_CT_CONST_WSZ 0x800 #define TCG_CT_CONST_TST 0x1000 +#define TCG_CT_CONST_ZERO 0x2000 /* Registers used with L constraint, which are the first argument registers on x86_64, and two random call clobbered registers on @@ -226,6 +227,9 @@ static bool tcg_target_const_match(int64_t val, int ct, if ((ct & TCG_CT_CONST_WSZ) && val == (type == TCG_TYPE_I32 ? 32 : 64)) { return 1; } + if ((ct & TCG_CT_CONST_ZERO) && val == 0) { + return 1; + } return 0; } @@ -3119,13 +3123,27 @@ static void tcg_out_cmpsel_vec(TCGContext *s, TCGType type, unsigned vece, TCGReg v0, TCGReg c1, TCGReg c2, TCGReg v3, TCGReg v4, TCGCond cond) { - if (tcg_out_cmp_vec_noinv(s, type, vece, TCG_TMP_VEC, c1, c2, cond)) { - TCGReg swap = v3; - v3 = v4; - v4 = swap; + bool inv = tcg_out_cmp_vec_noinv(s, type, vece, TCG_TMP_VEC, c1, c2, cond); + + /* + * Since XMM0 is 16, the only way we get 0 into V3 + * is via the constant zero constraint. + */ + if (!v3) { + if (inv) { + tcg_out_vex_modrm_type(s, OPC_PAND, v0, TCG_TMP_VEC, v4, type); + } else { + tcg_out_vex_modrm_type(s, OPC_PANDN, v0, TCG_TMP_VEC, v4, type); + } + } else { + if (inv) { + TCGReg swap = v3; + v3 = v4; + v4 = swap; + } + tcg_out_vex_modrm_type(s, OPC_VPBLENDVB, v0, v4, v3, type); + tcg_out8(s, (TCG_TMP_VEC - TCG_REG_XMM0) << 4); } - tcg_out_vex_modrm_type(s, OPC_VPBLENDVB, v0, v4, v3, type); - tcg_out8(s, (TCG_TMP_VEC - TCG_REG_XMM0) << 4); } static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, @@ -3716,7 +3734,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op) case INDEX_op_bitsel_vec: return C_O1_I3(x, x, x, x); case INDEX_op_cmpsel_vec: - return C_O1_I4(x, x, x, x, x); + return C_O1_I4(x, x, x, xO, x); default: g_assert_not_reached(); From patchwork Wed Sep 11 16:50:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800914 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 194A3EE57C6 for ; Wed, 11 Sep 2024 16:52:42 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYg-0002iP-7h; Wed, 11 Sep 2024 12:51:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYZ-0002PC-A1 for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:03 -0400 Received: from mail-pf1-x42e.google.com ([2607:f8b0:4864:20::42e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYX-0003g9-8R for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:02 -0400 Received: by mail-pf1-x42e.google.com with SMTP id d2e1a72fcca58-7191ee537cbso644807b3a.2 for ; Wed, 11 Sep 2024 09:51:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073460; x=1726678260; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=wC8NJvzs4xm1iFMunKJrmEEYWEWZeuBA49NgWvG2FqE=; b=jrgnsqxtrWpaG7vwfiwVlYoxE2wyA530GTWXX7+9lfpN9qPn1//Kg8VF16I8B+yXLq kyZ8h9qRtVz83HX7klv113ZRo4lQjVyZRBPWv1sSF81HY2hzl60VE51QI6gqmq3P0aEs immiPIHFWwt96uCQBCPUCQ0RRGZyVOzSk0EpEvojVozJ9qKL41p/Zozn+Q0RjuYmCgoJ FWTPH2aDyFurXRYPyHy8Du73nwHbBHF6wuHIyLSBqSb3kvC/TKYZD/sgN72qVUXJiQyc 0iyoM3D6bsDDHXFONyFyTIBmAbVhxuLKFkjPSeSYbNI9irnR8587bRk3boKnHDAewUOL lKoQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073460; x=1726678260; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=wC8NJvzs4xm1iFMunKJrmEEYWEWZeuBA49NgWvG2FqE=; b=aPB/yB5VM/O2THbaHwEpXTxKtPtZA+BnanGL2GrdRm4GnPM2wa2h0c4PyROnlT/dHy JuC9+eKSDeVYMp4NURk5rMugAUbl2Wx+loN1S6w/A/snPavlfkAVFFIT/yeE03uAhkKI kjFL1pJht9A5ZJyssu2+tM3bMChAHPFl3OfBMQNhDqUotps69l/mkYPr0VfYAVG1vtNK iO1RK+G1HqJWqoBM1rywoU56LN/qqjfZ67VjTzaJkJMZMsamS/2H3YdM6cloqqaIZryJ DazDhbDLQZqX3L3321Wec75M6KnpJyV+MpH7ep5ia0bQLfVrIy6zu9amWSNKnJaDSzD+ 2GSg== X-Gm-Message-State: AOJu0YyWZXiGrIna6bfNMsDizQQ+k05B4TqSNDBB4CnHz1qckf2d49Wh VEX06WLFcFnfOjp7nCzsXbE6DJfOdCzhPfZiQUit4ieGJTkMxohITxfNa9EGJRHs5wTmz1ju8nQ I X-Google-Smtp-Source: AGHT+IG+EsJeg+se6UGL3e8yKicFQLksj3SlmzyiVHY8wiup4cjxaZs3AyPdFYPglxLi3ZEijiCVDg== X-Received: by 2002:a05:6a00:138f:b0:70e:91ca:32ab with SMTP id d2e1a72fcca58-71926065435mr24403b3a.6.1726073459565; Wed, 11 Sep 2024 09:50:59 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.50.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:50:59 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 11/18] tcg/i386: Implement cmp_vec with avx512 insns Date: Wed, 11 Sep 2024 09:50:40 -0700 Message-ID: <20240911165047.1035764-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42e; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org The sse/avx instruction set only has EQ and GT as direct comparisons. Other signed comparisons can be generated from swapping and inversion. However unsigned comparisons are not available and must be transformed to signed comparisons by biasing the inputs. The avx512 instruction set has a complete set of comparisons, with results placed into a predicate register. We can produce the normal cmp_vec result by using VPMOVM2*. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 64 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 63 insertions(+), 1 deletion(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 210389955d..b1d642fc67 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -413,6 +413,14 @@ static bool tcg_target_const_match(int64_t val, int ct, #define OPC_UD2 (0x0b | P_EXT) #define OPC_VPBLENDD (0x02 | P_EXT3A | P_DATA16) #define OPC_VPBLENDVB (0x4c | P_EXT3A | P_DATA16) +#define OPC_VPCMPB (0x3f | P_EXT3A | P_DATA16 | P_EVEX) +#define OPC_VPCMPUB (0x3e | P_EXT3A | P_DATA16 | P_EVEX) +#define OPC_VPCMPW (0x3f | P_EXT3A | P_DATA16 | P_VEXW | P_EVEX) +#define OPC_VPCMPUW (0x3e | P_EXT3A | P_DATA16 | P_VEXW | P_EVEX) +#define OPC_VPCMPD (0x1f | P_EXT3A | P_DATA16 | P_EVEX) +#define OPC_VPCMPUD (0x1e | P_EXT3A | P_DATA16 | P_EVEX) +#define OPC_VPCMPQ (0x1f | P_EXT3A | P_DATA16 | P_VEXW | P_EVEX) +#define OPC_VPCMPUQ (0x1e | P_EXT3A | P_DATA16 | P_VEXW | P_EVEX) #define OPC_VPINSRB (0x20 | P_EXT3A | P_DATA16) #define OPC_VPINSRW (0xc4 | P_EXT | P_DATA16) #define OPC_VBROADCASTSS (0x18 | P_EXT38 | P_DATA16) @@ -421,6 +429,10 @@ static bool tcg_target_const_match(int64_t val, int ct, #define OPC_VPBROADCASTW (0x79 | P_EXT38 | P_DATA16) #define OPC_VPBROADCASTD (0x58 | P_EXT38 | P_DATA16) #define OPC_VPBROADCASTQ (0x59 | P_EXT38 | P_DATA16) +#define OPC_VPMOVM2B (0x28 | P_EXT38 | P_SIMDF3 | P_EVEX) +#define OPC_VPMOVM2W (0x28 | P_EXT38 | P_SIMDF3 | P_VEXW | P_EVEX) +#define OPC_VPMOVM2D (0x38 | P_EXT38 | P_SIMDF3 | P_EVEX) +#define OPC_VPMOVM2Q (0x38 | P_EXT38 | P_SIMDF3 | P_VEXW | P_EVEX) #define OPC_VPERMQ (0x00 | P_EXT3A | P_DATA16 | P_VEXW) #define OPC_VPERM2I128 (0x46 | P_EXT3A | P_DATA16 | P_VEXL) #define OPC_VPROLVD (0x15 | P_EXT38 | P_DATA16 | P_EVEX) @@ -3110,9 +3122,59 @@ static bool tcg_out_cmp_vec_noinv(TCGContext *s, TCGType type, unsigned vece, return fixup & NEED_INV; } +static void tcg_out_cmp_vec_k1(TCGContext *s, TCGType type, unsigned vece, + TCGReg v1, TCGReg v2, TCGCond cond) +{ + static const int cmpm_insn[2][4] = { + { OPC_VPCMPB, OPC_VPCMPW, OPC_VPCMPD, OPC_VPCMPQ }, + { OPC_VPCMPUB, OPC_VPCMPUW, OPC_VPCMPUD, OPC_VPCMPUQ } + }; + static const int cond_ext[16] = { + [TCG_COND_EQ] = 0, + [TCG_COND_NE] = 4, + [TCG_COND_LT] = 1, + [TCG_COND_LTU] = 1, + [TCG_COND_LE] = 2, + [TCG_COND_LEU] = 2, + [TCG_COND_NEVER] = 3, + [TCG_COND_GE] = 5, + [TCG_COND_GEU] = 5, + [TCG_COND_GT] = 6, + [TCG_COND_GTU] = 6, + [TCG_COND_ALWAYS] = 7, + }; + + tcg_out_vex_modrm_type(s, cmpm_insn[is_unsigned_cond(cond)][vece], + /* k1 */ 1, v1, v2, type); + tcg_out8(s, cond_ext[cond]); +} + +static void tcg_out_k1_to_vec(TCGContext *s, TCGType type, + unsigned vece, TCGReg dest) +{ + static const int movm_insn[] = { + OPC_VPMOVM2B, OPC_VPMOVM2W, OPC_VPMOVM2D, OPC_VPMOVM2Q + }; + tcg_out_vex_modrm_type(s, movm_insn[vece], dest, 0, /* k1 */ 1, type); +} + static void tcg_out_cmp_vec(TCGContext *s, TCGType type, unsigned vece, TCGReg v0, TCGReg v1, TCGReg v2, TCGCond cond) { + /* + * With avx512, we have a complete set of comparisons into mask. + * Unless there's a single insn expansion for the comparision, + * expand via a mask in k1. + */ + if ((vece <= MO_16 ? have_avx512bw : have_avx512dq) + && cond != TCG_COND_EQ + && cond != TCG_COND_LT + && cond != TCG_COND_GT) { + tcg_out_cmp_vec_k1(s, type, vece, v1, v2, cond); + tcg_out_k1_to_vec(s, type, vece, v0); + return; + } + if (tcg_out_cmp_vec_noinv(s, type, vece, v0, v1, v2, cond)) { tcg_out_dupi_vec(s, type, vece, TCG_TMP_VEC, -1); tcg_out_vex_modrm_type(s, OPC_PXOR, v0, v0, TCG_TMP_VEC, type); @@ -4078,7 +4140,7 @@ static TCGCond expand_vec_cond(TCGType type, unsigned vece, * We must bias the inputs so that they become signed. * All other swapping and inversion are handled during code generation. */ - if (vece == MO_64 && is_unsigned_cond(cond)) { + if (vece == MO_64 && !have_avx512dq && is_unsigned_cond(cond)) { TCGv_vec v1 = temp_tcgv_vec(arg_temp(*a1)); TCGv_vec v2 = temp_tcgv_vec(arg_temp(*a2)); TCGv_vec t1 = tcg_temp_new_vec(type); From patchwork Wed Sep 11 16:50:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800907 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A4E9EEE57C6 for ; Wed, 11 Sep 2024 16:51:48 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYd-0002Z2-6o; Wed, 11 Sep 2024 12:51:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYZ-0002QR-Jk for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:03 -0400 Received: from mail-pf1-x42e.google.com ([2607:f8b0:4864:20::42e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYX-0003gP-UN for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:03 -0400 Received: by mail-pf1-x42e.google.com with SMTP id d2e1a72fcca58-718d8d6af8fso4618040b3a.3 for ; Wed, 11 Sep 2024 09:51:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073461; x=1726678261; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Z2ZjvXoH00iViMW0slYpXozl5B8ORorypWDbC2UouOc=; b=kZH4iL8tJSQHUybIcKqziBVg1d4rmlzSVGSlwZDccjvchScW3l97PhDdJiFwVKSJsQ zdNJjTSyyxdNDunMk2lv1dXBQQPpRMxgfH40XNkk/xctk5ZD9e6hDkeo27MROdwwMT2x sDN3Bp13sTFO5upJBuIZbqDObjpTgKGOTE96lJcVsdTb2uZfUh4PvFTF1Bm7znHmHlJ3 pyjWJYgynz8eBuC8lGeMspgNy0TsCCp76hsAB8VMoJ4iHDcXUhu8hULkBX4rTw5VEQF/ 9+SXqvcHN0yVmWewyY69TORXeT+ul5S6nJnp+v5OVqnMC5gv97eQqfgY1gTnDX3ELhCu D+2g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073461; x=1726678261; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Z2ZjvXoH00iViMW0slYpXozl5B8ORorypWDbC2UouOc=; b=VNa+QvUtJBHJEZZviHdZfxU28CEEYYqX6OCEna10QMY8wuo00neR7wtA6K/EVc45E0 MfJFnPlqpQHXqrZvwg/9UKDtMug8ow+1QLtLZBT+S93d/yzcVxLD9lHVxqwxqbFXCe+U W7uetsPTRdMty+9NboygFKlfcNM5FJB/PYHCIotFDLkYDY5HlU7MRD+OmQGaanvvmuVT ryqxOHpQtrjIc40bWk2uoMlRsIPWecCWmu/jK/bZNbRKCwg1FKXMNbXHrskBelfk4+rY 4fMD7t6l0mOGvR6nf1fXYNuYM0+CmYH3ifYgFzQzTxgSfFYuGRIUf4gqgHu+4LOz9oLP /e6A== X-Gm-Message-State: AOJu0YzGE51dZo4t6QMwolD1kI1xRWdlUCNawCi6mHbbANpMp6xXYta/ 9FcK+OXFxTQNhq4yd5i4EhWtWq+3RWL3wt9DowDUgpmfcds5baUEAsCwpBdXmuNv7R0HOmxmOKe V X-Google-Smtp-Source: AGHT+IHxAMN52lfCmQ1WCCsDro1k+d4gyvu6q4UOhAPHh7y3SdV5iIfY6ki6stf4RIVFHP84z00Mjg== X-Received: by 2002:a05:6a00:b83:b0:70b:a46:7db7 with SMTP id d2e1a72fcca58-719260962c0mr12138b3a.16.1726073460572; Wed, 11 Sep 2024 09:51:00 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.50.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:51:00 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 12/18] tcg/i386: Add predicate parameters to tcg_out_evex_opc Date: Wed, 11 Sep 2024 09:50:41 -0700 Message-ID: <20240911165047.1035764-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42e; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Extend tcg_out_evex_opc to handle the predicate and zero-merging parameters of the evex prefix. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index b1d642fc67..f94a2a2385 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -674,7 +674,7 @@ static void tcg_out_vex_opc(TCGContext *s, int opc, int r, int v, } static void tcg_out_evex_opc(TCGContext *s, int opc, int r, int v, - int rm, int index) + int rm, int index, int aaa, bool z) { /* The entire 4-byte evex prefix; with R' and V' set. */ uint32_t p = 0x08041062; @@ -711,7 +711,9 @@ static void tcg_out_evex_opc(TCGContext *s, int opc, int r, int v, p = deposit32(p, 16, 2, pp); p = deposit32(p, 19, 4, ~v); p = deposit32(p, 23, 1, (opc & P_VEXW) != 0); + p = deposit32(p, 24, 3, aaa); p = deposit32(p, 29, 2, (opc & P_VEXL) != 0); + p = deposit32(p, 31, 1, z); tcg_out32(s, p); tcg_out8(s, opc); @@ -720,7 +722,7 @@ static void tcg_out_evex_opc(TCGContext *s, int opc, int r, int v, static void tcg_out_vex_modrm(TCGContext *s, int opc, int r, int v, int rm) { if (opc & P_EVEX) { - tcg_out_evex_opc(s, opc, r, v, rm, 0); + tcg_out_evex_opc(s, opc, r, v, rm, 0, 0, false); } else { tcg_out_vex_opc(s, opc, r, v, rm, 0); } From patchwork Wed Sep 11 16:50:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800920 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2BA82EE57C6 for ; Wed, 11 Sep 2024 16:53:15 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYn-00032K-9x; Wed, 11 Sep 2024 12:51:17 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYa-0002Uw-PA for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:04 -0400 Received: from mail-pf1-x433.google.com ([2607:f8b0:4864:20::433]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYY-0003gY-V2 for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:04 -0400 Received: by mail-pf1-x433.google.com with SMTP id d2e1a72fcca58-71790ed8c2dso6096634b3a.3 for ; Wed, 11 Sep 2024 09:51:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073461; x=1726678261; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=zmsViYQuZcRw3+Xw3xnkrlUzHZT6vqkjOcHzmQVj1eI=; b=K+6c07qMAeWsb/aUfoN1xs232iauJXkiIzfL7sNkj1x4SSkgL9MPmcSwaZJQuQLu3u hFrCvq2vnrYJBtdrVmMrnque1d0mfgOL7DgS36rkPt94av9xFvlBi4TOmMb8frqOGQiI IY3vysIIJWcmgtskbIDCTpmmFfPt7ilpCuOlFld9I5N1U48RvlqrdwDkoxbQTsEAWFoS ymMXAxmTsxnuyzM72VdTpeAunSwpaa/99HNw5iwTaAqp8yrjWYsQ+NIdvpCzoeB2Coui ZgE14P0K/cHb83LY8LMybAgvjsburHUMkHiYjVGYq8AbKHs8Osvw8SutjvGUiWc+/YU8 dsDw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073461; x=1726678261; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=zmsViYQuZcRw3+Xw3xnkrlUzHZT6vqkjOcHzmQVj1eI=; b=fP0QBsBoqodmLJYQkFbhkboP2WoSfuu67V2o7t0KrLMSs/OdZbhuP7oKcCLnoW4sxL iHP7iGqPV832bkoGSEyLfI2/QIdWXQnI9FIHnkh9rcHKxLi0ipngqlOt4qvqxS6ZFH36 j1yXMj7GLYPDbQvJSoBPfx7TT8m389bdn8A8GcGPDWiEIaJrPR775LtybhTw7U1P9UIR Sy57tXhDO3T+EN1cNIx0yDP4ecin6HhJeW1ybMn1gM7tqvEvm8vy7kOCtBFBms3FJYbg EQ9D8mrDF1r2NTWktVCYNkzI94clrLX2pVSizQZRZK5Lbn5GyukRfqrTO+XoIVg6zoj0 0K+g== X-Gm-Message-State: AOJu0YzNfrJBz18Qjm3cvpsX5Lju/rmUj+TNHbrBcYTk+b7DeLX6K1oq o7evtUX0uslrgnszBp0MdsvUpXLicJjzTqm6CT1WHJg0PBEP5mZdGRunr16Hb6CHn5G0VCueZn8 V X-Google-Smtp-Source: AGHT+IE6NjEYdSTgX3awPXzNLWY1wgb/qkt4rZEyTSNYsc74ZE/sU+YKHHoN8ApBWQDXH7+gmcektQ== X-Received: by 2002:a05:6a00:218f:b0:70d:2892:402b with SMTP id d2e1a72fcca58-7192608123fmr18890b3a.7.1726073461551; Wed, 11 Sep 2024 09:51:01 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.51.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:51:01 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 13/18] tcg/i386: Implement cmpsel_vec with avx512 insns Date: Wed, 11 Sep 2024 09:50:42 -0700 Message-ID: <20240911165047.1035764-14-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::433; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x433.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org The avx512 vpblendm* instructions exactly implement cmpsel, using a predicate input. Of course this matches nicely with the avx512 predicate comparison instructions. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 44 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 43 insertions(+), 1 deletion(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index f94a2a2385..d473dc7a5e 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -413,6 +413,10 @@ static bool tcg_target_const_match(int64_t val, int ct, #define OPC_UD2 (0x0b | P_EXT) #define OPC_VPBLENDD (0x02 | P_EXT3A | P_DATA16) #define OPC_VPBLENDVB (0x4c | P_EXT3A | P_DATA16) +#define OPC_VPBLENDMB (0x66 | P_EXT38 | P_DATA16 | P_EVEX) +#define OPC_VPBLENDMW (0x66 | P_EXT38 | P_DATA16 | P_VEXW | P_EVEX) +#define OPC_VPBLENDMD (0x64 | P_EXT38 | P_DATA16 | P_EVEX) +#define OPC_VPBLENDMQ (0x64 | P_EXT38 | P_DATA16 | P_VEXW | P_EVEX) #define OPC_VPCMPB (0x3f | P_EXT3A | P_DATA16 | P_EVEX) #define OPC_VPCMPUB (0x3e | P_EXT3A | P_DATA16 | P_EVEX) #define OPC_VPCMPW (0x3f | P_EXT3A | P_DATA16 | P_VEXW | P_EVEX) @@ -738,6 +742,16 @@ static void tcg_out_vex_modrm_type(TCGContext *s, int opc, tcg_out_vex_modrm(s, opc, r, v, rm); } +static void tcg_out_evex_modrm_type(TCGContext *s, int opc, int r, int v, + int rm, int aaa, bool z, TCGType type) +{ + if (type == TCG_TYPE_V256) { + opc |= P_VEXL; + } + tcg_out_evex_opc(s, opc, r, v, rm, 0, aaa, z); + tcg_out8(s, 0xc0 | (LOWREGMASK(r) << 3) | LOWREGMASK(rm)); +} + /* Output an opcode with a full "rm + (index< X-Patchwork-Id: 13800924 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6829CEE0206 for ; Wed, 11 Sep 2024 16:53:41 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYu-0003iR-G5; Wed, 11 Sep 2024 12:51:24 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYc-0002cU-KP for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:07 -0400 Received: from mail-pf1-x431.google.com ([2607:f8b0:4864:20::431]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYb-0003gv-1D for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:06 -0400 Received: by mail-pf1-x431.google.com with SMTP id d2e1a72fcca58-71798661a52so295b3a.0 for ; Wed, 11 Sep 2024 09:51:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073462; x=1726678262; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=yckJKrmE4doUh/0H+FLLTMdnHyJQpPXJPtVZq1bwd4E=; b=GCfVeQAWZqMGq7yuoGkQJeEzAClek/1s4J9QXEQMmK+WujmsiPPilzcFFUAPnX1CeJ xVSoSw9/wficOqIZw7Wr6MlFsk1tlBdZJRMzHDY5LBiIHThEChmKz+zNiDnv04x/W7sD VpHuJUMiej34dEuefF6rO/9DZkS+saQKBsa9P9H8E+SeXr/RE2RH9ngN90YfcTXCiXwa V3pgJrDZtBWyLCniWe59BCg6j7Eduulve79dqep899L64ofDfb1TJM0P2HCwjfx7ZjTb 2OsjUiuC6mSp5uMSY1ntFYc1xT4zzDBRY41np/ipz+W6jbS05vQuMSfLwHtkQfn2H5ky O+Rw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073462; x=1726678262; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=yckJKrmE4doUh/0H+FLLTMdnHyJQpPXJPtVZq1bwd4E=; b=MbzNUUPOMBZQUym64qVP6m8HOtufXOhsx+acg/D3SkPGn9r6hNy56zIzbLZGXXUHH7 K2BNYF3jnlfzECeYK7WdbUs5Cy3DMDMqEoIlhkleZft5cuKD9v1vNhBu15vMjWDPjumk OGBEt+yJ93fpMjfz6ovro65brSqg0Emf7HF/XrpjCHwPgK+DhO3Q/SiGmyWcruA5CG2b JwwhDxjsiotx4pN4TlYOjn1gQakYSRZrFGZKe+Ctq6B/kRcGbIMbXG603tZU5zRqNO12 Om+CF2Vyw+2IyOQmFOBKCj4SNKgP9DepsksiQIFxELtcZNKEHIv1G5EPCuSIp6IK9+4H z9TQ== X-Gm-Message-State: AOJu0YzgUilla3NTVvtZlcZqAKOcLktM4mCBTqy0N/y7Pc61na83q6WF rJapjS3mibWbb0QATTZtJUaPEsxlVTee/q0KUgTdANzeQy94BzsGLrH9BmIcYp5JM7MhQrNkRZh 0 X-Google-Smtp-Source: AGHT+IEzBw0OIyfANDFPTB4uwT6VtbQR838iCS9GB/82YawqeeA4Uqg1YVSAzrSQpX/lqBV+trdeyw== X-Received: by 2002:a05:6a00:124e:b0:70d:2a1b:422c with SMTP id d2e1a72fcca58-71907ee1396mr10585803b3a.7.1726073462284; Wed, 11 Sep 2024 09:51:02 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.51.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:51:01 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 14/18] tcg/i386: Implement vector TST{EQ,NE} for avx512 Date: Wed, 11 Sep 2024 09:50:43 -0700 Message-ID: <20240911165047.1035764-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::431; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x431.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Philippe Mathieu-Daudé --- tcg/i386/tcg-target.h | 2 +- tcg/i386/tcg-target.c.inc | 31 ++++++++++++++++++++++++++++--- 2 files changed, 29 insertions(+), 4 deletions(-) diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index 342be30c4c..c68ac023d8 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -224,7 +224,7 @@ typedef enum { #define TCG_TARGET_HAS_minmax_vec 1 #define TCG_TARGET_HAS_bitsel_vec have_avx512vl #define TCG_TARGET_HAS_cmpsel_vec 1 -#define TCG_TARGET_HAS_tst_vec 0 +#define TCG_TARGET_HAS_tst_vec have_avx512bw #define TCG_TARGET_deposit_i32_valid(ofs, len) \ (((ofs) == 0 && ((len) == 8 || (len) == 16)) || \ diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index d473dc7a5e..1bf50f1f62 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -462,6 +462,14 @@ static bool tcg_target_const_match(int64_t val, int ct, #define OPC_VPSRLVD (0x45 | P_EXT38 | P_DATA16) #define OPC_VPSRLVQ (0x45 | P_EXT38 | P_DATA16 | P_VEXW) #define OPC_VPTERNLOGQ (0x25 | P_EXT3A | P_DATA16 | P_VEXW | P_EVEX) +#define OPC_VPTESTMB (0x26 | P_EXT38 | P_DATA16 | P_EVEX) +#define OPC_VPTESTMW (0x26 | P_EXT38 | P_DATA16 | P_VEXW | P_EVEX) +#define OPC_VPTESTMD (0x27 | P_EXT38 | P_DATA16 | P_EVEX) +#define OPC_VPTESTMQ (0x27 | P_EXT38 | P_DATA16 | P_VEXW | P_EVEX) +#define OPC_VPTESTNMB (0x26 | P_EXT38 | P_SIMDF3 | P_EVEX) +#define OPC_VPTESTNMW (0x26 | P_EXT38 | P_SIMDF3 | P_VEXW | P_EVEX) +#define OPC_VPTESTNMD (0x27 | P_EXT38 | P_SIMDF3 | P_EVEX) +#define OPC_VPTESTNMQ (0x27 | P_EXT38 | P_SIMDF3 | P_VEXW | P_EVEX) #define OPC_VZEROUPPER (0x77 | P_EXT) #define OPC_XCHG_ax_r32 (0x90) #define OPC_XCHG_EvGv (0x87) @@ -3145,6 +3153,13 @@ static void tcg_out_cmp_vec_k1(TCGContext *s, TCGType type, unsigned vece, { OPC_VPCMPB, OPC_VPCMPW, OPC_VPCMPD, OPC_VPCMPQ }, { OPC_VPCMPUB, OPC_VPCMPUW, OPC_VPCMPUD, OPC_VPCMPUQ } }; + static const int testm_insn[4] = { + OPC_VPTESTMB, OPC_VPTESTMW, OPC_VPTESTMD, OPC_VPTESTMQ + }; + static const int testnm_insn[4] = { + OPC_VPTESTNMB, OPC_VPTESTNMW, OPC_VPTESTNMD, OPC_VPTESTNMQ + }; + static const int cond_ext[16] = { [TCG_COND_EQ] = 0, [TCG_COND_NE] = 4, @@ -3160,9 +3175,19 @@ static void tcg_out_cmp_vec_k1(TCGContext *s, TCGType type, unsigned vece, [TCG_COND_ALWAYS] = 7, }; - tcg_out_vex_modrm_type(s, cmpm_insn[is_unsigned_cond(cond)][vece], - /* k1 */ 1, v1, v2, type); - tcg_out8(s, cond_ext[cond]); + switch (cond) { + case TCG_COND_TSTNE: + tcg_out_vex_modrm_type(s, testm_insn[vece], /* k1 */ 1, v1, v2, type); + break; + case TCG_COND_TSTEQ: + tcg_out_vex_modrm_type(s, testnm_insn[vece], /* k1 */ 1, v1, v2, type); + break; + default: + tcg_out_vex_modrm_type(s, cmpm_insn[is_unsigned_cond(cond)][vece], + /* k1 */ 1, v1, v2, type); + tcg_out8(s, cond_ext[cond]); + break; + } } static void tcg_out_k1_to_vec(TCGContext *s, TCGType type, From patchwork Wed Sep 11 16:50:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800911 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E48CCEE57C6 for ; Wed, 11 Sep 2024 16:52:09 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYy-0003xP-DO; Wed, 11 Sep 2024 12:51:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYc-0002ai-9u for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:06 -0400 Received: from mail-pf1-x42d.google.com ([2607:f8b0:4864:20::42d]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYa-0003gs-Gg for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:06 -0400 Received: by mail-pf1-x42d.google.com with SMTP id d2e1a72fcca58-718d962ad64so60b3a.0 for ; Wed, 11 Sep 2024 09:51:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073463; x=1726678263; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=DUOSVdnU/ITJXIqpoZvSdwJoVXsC/VkcYgnTu23PytA=; b=nVWhoTUQcQ2+L3XkUBM5XrqWTcGIjV8eK5hnAzr4+TZ6sOh2HW2mxLBDDpeOu726sZ UnvuXpmze86Y9IZAt8U53ecwnhvzAmqaE43zvCtXfHFekwTcyPokf3+7PVsOBD+z1Ko8 vD2QT83UxV1dRreFL4L9v/4/EDdzTlX3abjN5onqR2X0fjmvnaK3gV/+CF2bGS1RrxVR eVgGHLgmBLvVifpGiO94Oj/DcdM+C5od7YTK6NULED5q3X8BLT4KQReqdJTbK6D8HOzE WZqH5f0cLW1vF3zEZIspKOifElGgN5tluXPGyKav4Ftcq7hm3r3Ya3uznxXoxUN6sMo7 DnMg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073463; x=1726678263; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DUOSVdnU/ITJXIqpoZvSdwJoVXsC/VkcYgnTu23PytA=; b=F1FMrowESBQjSCKThJdeRzkPxpcCzscANYR/DOsAt+JiHXWobTO3MHRWwsCKMFUfu1 Q+C9IOkm0U3nyb4wkYlQVdT/1FruxiKDphyM1dwM2VPp7LxQT63j+ZCOKwrTNpbTAXDD fsUgSRH5UDdY1RAv7oTy6QOSsRYRT5o7ap1Xjf9DmRGAvh3MTOAHvpbzh2+5JhsFg2GX xjntHp8WtLaFq5gTJU3mNAh9HBV4hgN1lbKflq1vfeMWpOONz2eh7kZJ/pv5RqpfYXvX r4Hyj6RQMMfIHcd9EY7GEqK9tRLgVwCV9B9EwkKYhgOQLsZj7h9/agB/O+vnm/4s2euG 9TbA== X-Gm-Message-State: AOJu0YyaY69vtBD/Pcx5QPKmFcWMyJ+4G2ByrxDUyUpo1eKJ/pzZaZO5 MlDCiVAvY1uci3zBPHw2ubdXeiq9KN4Kn0ehJ8BYrwsA3JW1ir0/tYI3Tv3If5uD3MQZ6F2sdbH 3 X-Google-Smtp-Source: AGHT+IF6glNca4E21LHopu8YEkpGOdyetBtzS/U9d9G+wwpf4Kkr64hN/fyd3+RD5+KPcMd6R3d8bg== X-Received: by 2002:a05:6a00:148a:b0:70d:2e24:af75 with SMTP id d2e1a72fcca58-718d5f06932mr29329988b3a.24.1726073463142; Wed, 11 Sep 2024 09:51:03 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.51.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:51:02 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 15/18] tcg/ppc: Implement cmpsel_vec Date: Wed, 11 Sep 2024 09:50:44 -0700 Message-ID: <20240911165047.1035764-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42d; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Do not allow cmpsel_vec to be expanded early, so that we can make the correct decision wrt the sense of the comparison. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target-con-set.h | 1 + tcg/ppc/tcg-target.h | 2 +- tcg/ppc/tcg-target.c.inc | 60 +++++++++++++++++++++++++++++++----- 3 files changed, 54 insertions(+), 9 deletions(-) diff --git a/tcg/ppc/tcg-target-con-set.h b/tcg/ppc/tcg-target-con-set.h index 9f99bde505..e7ba00c248 100644 --- a/tcg/ppc/tcg-target-con-set.h +++ b/tcg/ppc/tcg-target-con-set.h @@ -33,6 +33,7 @@ C_O1_I2(r, r, rU) C_O1_I2(r, r, rZW) C_O1_I2(v, v, v) C_O1_I3(v, v, v, v) +C_O1_I4(v, v, v, v, v) C_O1_I4(r, r, rC, rZ, rZ) C_O1_I4(r, r, r, ri, ri) C_O2_I1(r, r, r) diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h index e154fb14df..0b2171d38c 100644 --- a/tcg/ppc/tcg-target.h +++ b/tcg/ppc/tcg-target.h @@ -172,7 +172,7 @@ typedef enum { #define TCG_TARGET_HAS_sat_vec 1 #define TCG_TARGET_HAS_minmax_vec 1 #define TCG_TARGET_HAS_bitsel_vec have_vsx -#define TCG_TARGET_HAS_cmpsel_vec 0 +#define TCG_TARGET_HAS_cmpsel_vec 1 #define TCG_TARGET_HAS_tst_vec 0 #define TCG_TARGET_DEFAULT_MO (0) diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc index 497e130581..9d07b4d8e6 100644 --- a/tcg/ppc/tcg-target.c.inc +++ b/tcg/ppc/tcg-target.c.inc @@ -3573,6 +3573,7 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_rotli_vec: return vece <= MO_32 || have_isa_2_07 ? -1 : 0; case INDEX_op_cmp_vec: + case INDEX_op_cmpsel_vec: return vece <= MO_32 || have_isa_2_07 ? 1 : 0; case INDEX_op_neg_vec: return vece >= MO_32 && have_isa_3_00; @@ -3719,6 +3720,33 @@ static void tcg_out_not_vec(TCGContext *s, TCGReg a0, TCGReg a1) tcg_out32(s, VNOR | VRT(a0) | VRA(a1) | VRB(a1)); } +static void tcg_out_or_vec(TCGContext *s, TCGReg a0, TCGReg a1, TCGReg a2) +{ + tcg_out32(s, VOR | VRT(a0) | VRA(a1) | VRB(a2)); +} + +static void tcg_out_and_vec(TCGContext *s, TCGReg a0, TCGReg a1, TCGReg a2) +{ + tcg_out32(s, VAND | VRT(a0) | VRA(a1) | VRB(a2)); +} + +static void tcg_out_andc_vec(TCGContext *s, TCGReg a0, TCGReg a1, TCGReg a2) +{ + tcg_out32(s, VANDC | VRT(a0) | VRA(a1) | VRB(a2)); +} + +static void tcg_out_bitsel_vec(TCGContext *s, TCGReg d, + TCGReg c, TCGReg t, TCGReg f) +{ + if (TCG_TARGET_HAS_bitsel_vec) { + tcg_out32(s, XXSEL | VRT(d) | VRC(c) | VRB(t) | VRA(f)); + } else { + tcg_out_and_vec(s, TCG_VEC_TMP2, t, c); + tcg_out_andc_vec(s, d, f, c); + tcg_out_or_vec(s, d, d, TCG_VEC_TMP2); + } +} + static bool tcg_out_cmp_vec_noinv(TCGContext *s, unsigned vece, TCGReg a0, TCGReg a1, TCGReg a2, TCGCond cond) { @@ -3798,6 +3826,18 @@ static void tcg_out_cmp_vec(TCGContext *s, unsigned vece, TCGReg a0, } } +static void tcg_out_cmpsel_vec(TCGContext *s, unsigned vece, TCGReg a0, + TCGReg c1, TCGReg c2, TCGReg v3, TCGReg v4, + TCGCond cond) +{ + if (tcg_out_cmp_vec_noinv(s, vece, TCG_VEC_TMP1, c1, c2, cond)) { + TCGReg swap = v3; + v3 = v4; + v4 = swap; + } + tcg_out_bitsel_vec(s, a0, TCG_VEC_TMP1, v3, v4); +} + static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned vecl, unsigned vece, const TCGArg args[TCG_MAX_OP_ARGS], @@ -3889,17 +3929,17 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, insn = sarv_op[vece]; break; case INDEX_op_and_vec: - insn = VAND; - break; + tcg_out_and_vec(s, a0, a1, a2); + return; case INDEX_op_or_vec: - insn = VOR; - break; + tcg_out_or_vec(s, a0, a1, a2); + return; case INDEX_op_xor_vec: insn = VXOR; break; case INDEX_op_andc_vec: - insn = VANDC; - break; + tcg_out_andc_vec(s, a0, a1, a2); + return; case INDEX_op_not_vec: tcg_out_not_vec(s, a0, a1); return; @@ -3919,9 +3959,11 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_cmp_vec: tcg_out_cmp_vec(s, vece, a0, a1, a2, args[3]); return; - + case INDEX_op_cmpsel_vec: + tcg_out_cmpsel_vec(s, vece, a0, a1, a2, args[3], args[4], args[5]); + return; case INDEX_op_bitsel_vec: - tcg_out32(s, XXSEL | VRT(a0) | VRC(a1) | VRB(a2) | VRA(args[3])); + tcg_out_bitsel_vec(s, a0, a1, a2, args[3]); return; case INDEX_op_dup2_vec: @@ -4287,6 +4329,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op) case INDEX_op_bitsel_vec: case INDEX_op_ppc_msum_vec: return C_O1_I3(v, v, v, v); + case INDEX_op_cmpsel_vec: + return C_O1_I4(v, v, v, v, v); default: g_assert_not_reached(); From patchwork Wed Sep 11 16:50:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800922 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2BB86EE57C6 for ; Wed, 11 Sep 2024 16:53:28 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYo-0003Ku-99; Wed, 11 Sep 2024 12:51:19 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYd-0002hJ-QU for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:09 -0400 Received: from mail-pg1-x530.google.com ([2607:f8b0:4864:20::530]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYb-0003h0-JO for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:07 -0400 Received: by mail-pg1-x530.google.com with SMTP id 41be03b00d2f7-7db19de6346so54437a12.2 for ; Wed, 11 Sep 2024 09:51:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073464; x=1726678264; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=r4uFa1EewNZVM8eb9T0XGX/kCpjVJLtWV1XZpTNAvas=; b=QdGBNTQnAezKjfSfr/ApFHdNcyHyjKNX0YlCMNL6WeTQsR48n3pFAPIu2/9sbepw6Z P0RotgXBqtB+S+rbEQz/XiVbiBaGlW/bnXEWQSGgQAd6Eguko/TNHYZQqcGEx8Tteg7d HWYZm3n+8aQePZiNi4Fx1WbF57z0BcKn6G3zhY1DstSpcORV6gGH3uvUa+Js5/9TdV28 R4bl39PXAJi1S33q5seJzVVzej9myICzBprSpNFFkCtkEbad4LjQ+VG/8Dg01IjRKWbW A8vUN7os+iUqBnwBiNZ2kLaSi8cU0rBTNgWiSwoD2HhOW6sGlWOjRU0gbA5ra7BlYA4G CF0A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073464; x=1726678264; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=r4uFa1EewNZVM8eb9T0XGX/kCpjVJLtWV1XZpTNAvas=; b=QTmMr1QVPfKkhapriSqbXofydBMJJL6SziTwaqV8KkWfqzne2jGcMVkn9LDyeisDg2 QVxtxzUWJA9Pip8gPcLS3EeOFWcfC+wDMIhol6PPbIiIITTav43bzcQCqTqFYHqwNLTH kH3LXCa2hVUTgo+NCP3goYk/TzgrEIGhqi04sGgwzXEq4LsZSJYNaIsV4rPEKtIDO4t6 +wCH6c2Y70OAZlfCSYBnwqWbGoENxojwKr9CiwZIMQGUPwSpjCKKM9AkL5FxFB+Dz8WC QbuL2bmt+XWpWqi83SIM3BpMIz1JqrXKs8MJ2cA5VMxIo9JpFAoXCI874JrHbSsnQflW z8Jg== X-Gm-Message-State: AOJu0Yxcl1CpplapGK2H3xHZJLsjI0Xo5IZV82VRXSGIg7ioOTBHtkec Txdw7W0BSUZDRIUKQkwTXj51N9DOvB2cZlrqvbALCXC5WI/AAfYuzf46eEJzr1N7sKFtY+CuDT3 E X-Google-Smtp-Source: AGHT+IF7gXFmjuuN+NHTysq0ASm6dtsiywMSCg5hW/S37U0mgcwmqlONYH48C53btduETInqf5/UiQ== X-Received: by 2002:a05:6a20:85a6:b0:1cf:6d68:4a0 with SMTP id adf61e73a8af0-1cf6d68118amr1871707637.22.1726073464058; Wed, 11 Sep 2024 09:51:04 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.51.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:51:03 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 16/18] tcg/ppc: Optimize cmpsel with constant 0/-1 arguments Date: Wed, 11 Sep 2024 09:50:45 -0700 Message-ID: <20240911165047.1035764-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::530; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x530.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org These can be simplified to and/or/andc/orc, avoiding the load of the constantinto a register. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target-con-set.h | 2 +- tcg/ppc/tcg-target.c.inc | 43 +++++++++++++++++++++++++++--------- 2 files changed, 33 insertions(+), 12 deletions(-) diff --git a/tcg/ppc/tcg-target-con-set.h b/tcg/ppc/tcg-target-con-set.h index e7ba00c248..453abde6c1 100644 --- a/tcg/ppc/tcg-target-con-set.h +++ b/tcg/ppc/tcg-target-con-set.h @@ -33,7 +33,7 @@ C_O1_I2(r, r, rU) C_O1_I2(r, r, rZW) C_O1_I2(v, v, v) C_O1_I3(v, v, v, v) -C_O1_I4(v, v, v, v, v) +C_O1_I4(v, v, v, vZM, v) C_O1_I4(r, r, rC, rZ, rZ) C_O1_I4(r, r, r, ri, ri) C_O2_I1(r, r, r) diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc index 9d07b4d8e6..3f413ce3c1 100644 --- a/tcg/ppc/tcg-target.c.inc +++ b/tcg/ppc/tcg-target.c.inc @@ -3725,6 +3725,11 @@ static void tcg_out_or_vec(TCGContext *s, TCGReg a0, TCGReg a1, TCGReg a2) tcg_out32(s, VOR | VRT(a0) | VRA(a1) | VRB(a2)); } +static void tcg_out_orc_vec(TCGContext *s, TCGReg a0, TCGReg a1, TCGReg a2) +{ + tcg_out32(s, VORC | VRT(a0) | VRA(a1) | VRB(a2)); +} + static void tcg_out_and_vec(TCGContext *s, TCGReg a0, TCGReg a1, TCGReg a2) { tcg_out32(s, VAND | VRT(a0) | VRA(a1) | VRB(a2)); @@ -3827,15 +3832,30 @@ static void tcg_out_cmp_vec(TCGContext *s, unsigned vece, TCGReg a0, } static void tcg_out_cmpsel_vec(TCGContext *s, unsigned vece, TCGReg a0, - TCGReg c1, TCGReg c2, TCGReg v3, TCGReg v4, - TCGCond cond) + TCGReg c1, TCGReg c2, TCGArg v3, int const_v3, + TCGReg v4, TCGCond cond) { - if (tcg_out_cmp_vec_noinv(s, vece, TCG_VEC_TMP1, c1, c2, cond)) { - TCGReg swap = v3; - v3 = v4; - v4 = swap; + bool inv = tcg_out_cmp_vec_noinv(s, vece, TCG_VEC_TMP1, c1, c2, cond); + + if (!const_v3) { + if (inv) { + tcg_out_bitsel_vec(s, a0, TCG_VEC_TMP1, v4, v3); + } else { + tcg_out_bitsel_vec(s, a0, TCG_VEC_TMP1, v3, v4); + } + } else if (v3) { + if (inv) { + tcg_out_orc_vec(s, a0, v4, TCG_VEC_TMP1); + } else { + tcg_out_or_vec(s, a0, v4, TCG_VEC_TMP1); + } + } else { + if (inv) { + tcg_out_and_vec(s, a0, v4, TCG_VEC_TMP1); + } else { + tcg_out_andc_vec(s, a0, v4, TCG_VEC_TMP1); + } } - tcg_out_bitsel_vec(s, a0, TCG_VEC_TMP1, v3, v4); } static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, @@ -3944,8 +3964,8 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, tcg_out_not_vec(s, a0, a1); return; case INDEX_op_orc_vec: - insn = VORC; - break; + tcg_out_orc_vec(s, a0, a1, a2); + return; case INDEX_op_nand_vec: insn = VNAND; break; @@ -3960,7 +3980,8 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, tcg_out_cmp_vec(s, vece, a0, a1, a2, args[3]); return; case INDEX_op_cmpsel_vec: - tcg_out_cmpsel_vec(s, vece, a0, a1, a2, args[3], args[4], args[5]); + tcg_out_cmpsel_vec(s, vece, a0, a1, a2, + args[3], const_args[3], args[4], args[5]); return; case INDEX_op_bitsel_vec: tcg_out_bitsel_vec(s, a0, a1, a2, args[3]); @@ -4330,7 +4351,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op) case INDEX_op_ppc_msum_vec: return C_O1_I3(v, v, v, v); case INDEX_op_cmpsel_vec: - return C_O1_I4(v, v, v, v, v); + return C_O1_I4(v, v, v, vZM, v); default: g_assert_not_reached(); From patchwork Wed Sep 11 16:50:46 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800913 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 662ACEE57C6 for ; Wed, 11 Sep 2024 16:52:32 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYw-0003jZ-3A; Wed, 11 Sep 2024 12:51:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYe-0002ik-6X for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:09 -0400 Received: from mail-pg1-x52a.google.com ([2607:f8b0:4864:20::52a]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYc-0003hC-92 for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:07 -0400 Received: by mail-pg1-x52a.google.com with SMTP id 41be03b00d2f7-6e7b121be30so57351a12.1 for ; Wed, 11 Sep 2024 09:51:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073465; x=1726678265; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=OSoJkCohCcuhNpwRCeBuz1dXYbMOq4iwACHfjeEE008=; b=X75XoWWYv88jEgYgadMb82WxpSoa0+NOONq6niD1vUimGFuxWomF6yce8wyJgFSY2T jezoYrTj/z3xMrBU87RdEb+NjHFIyZH7iUmdh12j/9LvGBi7L1CXB29jSWiQsxo4tv3a qOEMceLktnKJg8xWwKV+8UrK+b9tzLPYuMw/pDOWmuBrK54HkMuQtF2nxibcxBh7dsrD d6wocaUMTObDCx0nzpIDBaT8+YLCw3xWy0pQmCj9hck3yQYI8bW/wUR5XdT7Lcs5FpPO +W8vmyfLJLgSfi5SGqssDTxL4/iEGbNsd1FdiF4DfRhTfT5Wo315TDu9i3qlRaEmpb8k TbBg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073465; x=1726678265; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=OSoJkCohCcuhNpwRCeBuz1dXYbMOq4iwACHfjeEE008=; b=GcDcjKt6LsaI0j3Bkv/uwIOsEfvTM25L0/yTdrArjeZkcNKbBChPgK1htTM/+0Nppj 4UfZtFld/5huRuEZuNx6ylcgyoAY1A36Y3ohjbhtaoA9aE1EhXV36jcYZkDv/XkHfBpN /GrdyekDP95/VdskEaPwyjdLPu+AjxWI7v0lFNMzG6xLmAjo9g6RW5rKywSm41d7wY4W 5IeyQWKwqEhUcfqN/zQEvTHzWY+nCXqVxRCKb7FmxxBvrmCjROknpQIZMQaVgNCrrDVQ CPrEUOuwFtkabkRrh2jj3DPEun2PruhQmbNyahBRZ1SNFomiq9GLwnkV10HXyG+AK7Fy 0Osw== X-Gm-Message-State: AOJu0YwjjhImpKdR7AauhJif04UKJbWV4+adSCNFOSAcIR0nbGzs/zGg u96wOjf5GUIXPvAXPpeeeOvMHFfeecelzyq7j5eS/ivcOJG/fk+kum4/ixB2S2dFtfwhvezCgEQ / X-Google-Smtp-Source: AGHT+IFgmBX3Ndj1zySBA2n14pW77dWcU+KATd1FNACR29kzN9MvH3/pssIJxNyNJDpcuHu4x4EOrw== X-Received: by 2002:a05:6a21:58d:b0:1cf:2aa3:1f94 with SMTP id adf61e73a8af0-1cf5e0f80c3mr6937155637.22.1726073464862; Wed, 11 Sep 2024 09:51:04 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.51.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:51:04 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 17/18] tcg/s390x: Implement cmpsel_vec Date: Wed, 11 Sep 2024 09:50:46 -0700 Message-ID: <20240911165047.1035764-18-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52a; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Do not allow cmpsel_vec to be expanded early, so that we can make the correct decision wrt the sense of the comparison. Signed-off-by: Richard Henderson --- tcg/s390x/tcg-target-con-set.h | 1 + tcg/s390x/tcg-target.h | 2 +- tcg/s390x/tcg-target.c.inc | 40 ++++++++++++++++++---------------- 3 files changed, 23 insertions(+), 20 deletions(-) diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h index f75955eaa8..670089086d 100644 --- a/tcg/s390x/tcg-target-con-set.h +++ b/tcg/s390x/tcg-target-con-set.h @@ -38,6 +38,7 @@ C_O1_I2(r, rZ, r) C_O1_I2(v, v, r) C_O1_I2(v, v, v) C_O1_I3(v, v, v, v) +C_O1_I4(v, v, v, v, v) C_O1_I4(r, r, ri, rI, r) C_O1_I4(r, r, rC, rI, r) C_O2_I1(o, m, r) diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h index 62ce9d792a..86aeca166f 100644 --- a/tcg/s390x/tcg-target.h +++ b/tcg/s390x/tcg-target.h @@ -162,7 +162,7 @@ extern uint64_t s390_facilities[3]; #define TCG_TARGET_HAS_sat_vec 0 #define TCG_TARGET_HAS_minmax_vec 1 #define TCG_TARGET_HAS_bitsel_vec 1 -#define TCG_TARGET_HAS_cmpsel_vec 0 +#define TCG_TARGET_HAS_cmpsel_vec 1 #define TCG_TARGET_HAS_tst_vec 0 /* used for function call generation */ diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc index 23935fd0f0..e044168826 100644 --- a/tcg/s390x/tcg-target.c.inc +++ b/tcg/s390x/tcg-target.c.inc @@ -46,6 +46,7 @@ /* A scratch register that may be be used throughout the backend. */ #define TCG_TMP0 TCG_REG_R1 +#define TCG_VEC_TMP0 TCG_REG_V31 #define TCG_GUEST_BASE_REG TCG_REG_R13 @@ -2902,6 +2903,18 @@ static void tcg_out_cmp_vec(TCGContext *s, unsigned vece, TCGReg a0, } } +static void tcg_out_cmpsel_vec(TCGContext *s, unsigned vece, TCGReg a0, + TCGReg c1, TCGReg c2, + TCGReg v3, TCGReg v4, TCGCond cond) +{ + if (tcg_out_cmp_vec_noinv(s, vece, TCG_VEC_TMP0, c1, c2, cond)) { + TCGReg swap = v3; + v3 = v4; + v4 = swap; + } + tcg_out_insn(s, VRRe, VSEL, a0, v3, v4, TCG_VEC_TMP0); +} + static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned vecl, unsigned vece, const TCGArg args[TCG_MAX_OP_ARGS], @@ -3022,6 +3035,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_cmp_vec: tcg_out_cmp_vec(s, vece, a0, a1, a2, args[3]); break; + case INDEX_op_cmpsel_vec: + tcg_out_cmpsel_vec(s, vece, a0, a1, a2, args[3], args[4], args[5]); + break; case INDEX_op_s390_vuph_vec: tcg_out_insn(s, VRRa, VUPH, a0, a1, vece); @@ -3074,8 +3090,8 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_umin_vec: case INDEX_op_xor_vec: case INDEX_op_cmp_vec: - return 1; case INDEX_op_cmpsel_vec: + return 1; case INDEX_op_rotrv_vec: return -1; case INDEX_op_mul_vec: @@ -3088,17 +3104,6 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) } } -static void expand_vec_cmpsel(TCGType type, unsigned vece, TCGv_vec v0, - TCGv_vec c1, TCGv_vec c2, - TCGv_vec v3, TCGv_vec v4, TCGCond cond) -{ - TCGv_vec t = tcg_temp_new_vec(type); - - tcg_gen_cmp_vec(cond, vece, t, c1, c2); - tcg_gen_bitsel_vec(vece, v0, t, v3, v4); - tcg_temp_free_vec(t); -} - static void expand_vec_sat(TCGType type, unsigned vece, TCGv_vec v0, TCGv_vec v1, TCGv_vec v2, TCGOpcode add_sub_opc) { @@ -3140,7 +3145,7 @@ void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, TCGArg a0, ...) { va_list va; - TCGv_vec v0, v1, v2, v3, v4, t0; + TCGv_vec v0, v1, v2, t0; va_start(va, a0); v0 = temp_tcgv_vec(arg_temp(a0)); @@ -3148,12 +3153,6 @@ void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, v2 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); switch (opc) { - case INDEX_op_cmpsel_vec: - v3 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); - v4 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); - expand_vec_cmpsel(type, vece, v0, v1, v2, v3, v4, va_arg(va, TCGArg)); - break; - case INDEX_op_rotrv_vec: t0 = tcg_temp_new_vec(type); tcg_gen_neg_vec(vece, t0, v2); @@ -3388,6 +3387,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op) return C_O1_I2(v, v, r); case INDEX_op_bitsel_vec: return C_O1_I3(v, v, v, v); + case INDEX_op_cmpsel_vec: + return C_O1_I4(v, v, v, v, v); default: g_assert_not_reached(); @@ -3512,6 +3513,7 @@ static void tcg_target_init(TCGContext *s) s->reserved_regs = 0; tcg_regset_set_reg(s->reserved_regs, TCG_TMP0); + tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP0); /* XXX many insns can't be used with R0, so we better avoid it for now */ tcg_regset_set_reg(s->reserved_regs, TCG_REG_R0); tcg_regset_set_reg(s->reserved_regs, TCG_REG_CALL_STACK); From patchwork Wed Sep 11 16:50:47 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13800917 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4485EEE0206 for ; Wed, 11 Sep 2024 16:53:00 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1soQYw-0003lo-2n; Wed, 11 Sep 2024 12:51:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1soQYf-0002nB-B4 for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:10 -0400 Received: from mail-pf1-x432.google.com ([2607:f8b0:4864:20::432]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1soQYd-0003hS-9x for qemu-devel@nongnu.org; Wed, 11 Sep 2024 12:51:09 -0400 Received: by mail-pf1-x432.google.com with SMTP id d2e1a72fcca58-717934728adso5173278b3a.2 for ; Wed, 11 Sep 2024 09:51:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1726073466; x=1726678266; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=nf2KKD8gpkfqLcprZUwSXAhZF2kW47U2PBUtTXC4uSQ=; b=xa8rkY1XmNjHsqmS+m3G3AzgZ5EatdbBlQe3gtSqGvfVD/Fz9UrfdmE/b6dqDFSYSp iLlLuGvy0BDC+azWKALjXdol510ZJaIl3/SUnJDDl3qzSWHH1IVqqnKKrZ8h8b5RaVZj uf2wBZNPD+ZkUYFDycNcawa/RQp12Z5bexDGxztIjaFhX/AXyrmavEddJpTdQMPj5fll 1UIEa8OS6N144KG0NrAwCaGL2grO6lE5J9vWImJT44XJmoKwRrCUpJX/YTXnDLQoVtQd yZ8PZy7XFB5J4nSiJazJnx0+3XCEBtgGug92mZHuF2cWdbAgW7Z4IUXQZr31RHf/6qcM aJQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726073466; x=1726678266; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=nf2KKD8gpkfqLcprZUwSXAhZF2kW47U2PBUtTXC4uSQ=; b=p9IQj/UcDt69eHtzfhR0uSkVeQU2hAYf0xGFVBZkydKovdyEdkGZ1Tq+/Uc2+8V9eC h3PZl9ahdSccyaoeBw1f2AOF1rMPNfgnafMULLJFWf4bnv+PEMNum6LeG9eMZlOzYQOQ 5wWe2KitngVl5mtRX3hTyeS7b+3obwXf0XCgk8igTvUEhuXR3njYhJkX/QXnhvufVqhx l1OkefFxl6jwbYietk9607/+tmU4yHdbu1XEbNYbz9iB3/RfGSHxZUGhLHrEK2v8Xa2T izgDqgkZZBVcTmDzvHt4D7Eq4XCdTGwXjlsTnkNII9NjahNbGItN+zsdfkwKSGqWbJ3u mgKw== X-Gm-Message-State: AOJu0YxKgVEBcNP+f5CpxoHWA3BlYHa6MzQ4XifpgpdmfKIq8xvVNdb1 zhNTV0zHQhou4F+KBxg6Oh5IxT76UnUMdB+/PvqyG0SOnKBBvP5K4cSboAueYrjith9WpFypz8o o X-Google-Smtp-Source: AGHT+IFIRcxULGhFCZ/bAMuVohDntJldVrf8vHcuIW0wOXvPxgS9clUSNomZyvj4OVeQCrVKatyYXg== X-Received: by 2002:a05:6a00:4f90:b0:717:97ac:ef46 with SMTP id d2e1a72fcca58-7192609615amr16010b3a.15.1726073465714; Wed, 11 Sep 2024 09:51:05 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71908fe4e7esm3186947b3a.80.2024.09.11.09.51.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Sep 2024 09:51:05 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, philmd@linaro.org Subject: [PATCH v2 18/18] tcg/s390x: Optimize cmpsel with constant 0/-1 arguments Date: Wed, 11 Sep 2024 09:50:47 -0700 Message-ID: <20240911165047.1035764-19-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240911165047.1035764-1-richard.henderson@linaro.org> References: <20240911165047.1035764-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::432; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x432.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org These can be simplified to and/or/andc/orc, avoiding the load of the constantinto a register. Signed-off-by: Richard Henderson --- tcg/s390x/tcg-target-con-set.h | 3 ++- tcg/s390x/tcg-target-con-str.h | 1 + tcg/s390x/tcg-target.c.inc | 40 ++++++++++++++++++++++++++-------- 3 files changed, 34 insertions(+), 10 deletions(-) diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h index 670089086d..370e4b1295 100644 --- a/tcg/s390x/tcg-target-con-set.h +++ b/tcg/s390x/tcg-target-con-set.h @@ -38,7 +38,8 @@ C_O1_I2(r, rZ, r) C_O1_I2(v, v, r) C_O1_I2(v, v, v) C_O1_I3(v, v, v, v) -C_O1_I4(v, v, v, v, v) +C_O1_I4(v, v, v, vZ, v) +C_O1_I4(v, v, v, vZM, v) C_O1_I4(r, r, ri, rI, r) C_O1_I4(r, r, rC, rI, r) C_O2_I1(o, m, r) diff --git a/tcg/s390x/tcg-target-con-str.h b/tcg/s390x/tcg-target-con-str.h index 745f6c0df5..3e574e0662 100644 --- a/tcg/s390x/tcg-target-con-str.h +++ b/tcg/s390x/tcg-target-con-str.h @@ -20,6 +20,7 @@ CONST('C', TCG_CT_CONST_CMP) CONST('I', TCG_CT_CONST_S16) CONST('J', TCG_CT_CONST_S32) CONST('K', TCG_CT_CONST_P32) +CONST('M', TCG_CT_CONST_M1) CONST('N', TCG_CT_CONST_INV) CONST('R', TCG_CT_CONST_INVRISBG) CONST('U', TCG_CT_CONST_U32) diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc index e044168826..a5d57197a4 100644 --- a/tcg/s390x/tcg-target.c.inc +++ b/tcg/s390x/tcg-target.c.inc @@ -36,6 +36,7 @@ #define TCG_CT_CONST_INV (1 << 13) #define TCG_CT_CONST_INVRISBG (1 << 14) #define TCG_CT_CONST_CMP (1 << 15) +#define TCG_CT_CONST_M1 (1 << 16) #define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 16) #define ALL_VECTOR_REGS MAKE_64BIT_MASK(32, 32) @@ -607,6 +608,9 @@ static bool tcg_target_const_match(int64_t val, int ct, if ((ct & TCG_CT_CONST_ZERO) && val == 0) { return true; } + if ((ct & TCG_CT_CONST_M1) && val == -1) { + return true; + } if (ct & TCG_CT_CONST_INV) { val = ~val; @@ -2904,15 +2908,30 @@ static void tcg_out_cmp_vec(TCGContext *s, unsigned vece, TCGReg a0, } static void tcg_out_cmpsel_vec(TCGContext *s, unsigned vece, TCGReg a0, - TCGReg c1, TCGReg c2, - TCGReg v3, TCGReg v4, TCGCond cond) + TCGReg c1, TCGReg c2, TCGArg v3, + int const_v3, TCGReg v4, TCGCond cond) { - if (tcg_out_cmp_vec_noinv(s, vece, TCG_VEC_TMP0, c1, c2, cond)) { - TCGReg swap = v3; - v3 = v4; - v4 = swap; + bool inv = tcg_out_cmp_vec_noinv(s, vece, TCG_VEC_TMP0, c1, c2, cond); + + if (!const_v3) { + if (inv) { + tcg_out_insn(s, VRRe, VSEL, a0, v4, v3, TCG_VEC_TMP0); + } else { + tcg_out_insn(s, VRRe, VSEL, a0, v3, v4, TCG_VEC_TMP0); + } + } else if (v3) { + if (inv) { + tcg_out_insn(s, VRRc, VOC, a0, v4, TCG_VEC_TMP0, 0); + } else { + tcg_out_insn(s, VRRc, VO, a0, v4, TCG_VEC_TMP0, 0); + } + } else { + if (inv) { + tcg_out_insn(s, VRRc, VN, a0, v4, TCG_VEC_TMP0, 0); + } else { + tcg_out_insn(s, VRRc, VNC, a0, v4, TCG_VEC_TMP0, 0); + } } - tcg_out_insn(s, VRRe, VSEL, a0, v3, v4, TCG_VEC_TMP0); } static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, @@ -3036,7 +3055,8 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, tcg_out_cmp_vec(s, vece, a0, a1, a2, args[3]); break; case INDEX_op_cmpsel_vec: - tcg_out_cmpsel_vec(s, vece, a0, a1, a2, args[3], args[4], args[5]); + tcg_out_cmpsel_vec(s, vece, a0, a1, a2, args[3], const_args[3], + args[4], args[5]); break; case INDEX_op_s390_vuph_vec: @@ -3388,7 +3408,9 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op) case INDEX_op_bitsel_vec: return C_O1_I3(v, v, v, v); case INDEX_op_cmpsel_vec: - return C_O1_I4(v, v, v, v, v); + return (TCG_TARGET_HAS_orc_vec + ? C_O1_I4(v, v, v, vZM, v) + : C_O1_I4(v, v, v, vZ, v)); default: g_assert_not_reached();