From patchwork Sun Sep 8 02:26:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13795306 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4884CECE579 for ; Sun, 8 Sep 2024 02:28:06 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sn7dU-0003Gg-UI; Sat, 07 Sep 2024 22:26:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sn7dQ-0002uU-De for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:40 -0400 Received: from mail-pj1-x102d.google.com ([2607:f8b0:4864:20::102d]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sn7dN-0004yq-Ol for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:40 -0400 Received: by mail-pj1-x102d.google.com with SMTP id 98e67ed59e1d1-2d87a0bfaa7so2344094a91.2 for ; Sat, 07 Sep 2024 19:26:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725762396; x=1726367196; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=scmIlN57Rm25WLgsDvFofZSXreJWo4oYdwMtH2Klqjo=; b=YNCLN8+x5H+WX05kGDckfJgwfD7Z6YLino/SQXbKLW4UO/oU7H0mWIw2pNl4ZUmSoA 4+C+fCHAskgjcJUVWs7GcNsw7coyyymAPISc0OvTtBoirTsFetQ1lD3R9MZjaExAmL3G c3sdxdWg+vV9nXWOQt1IJSQQ7UMReKdldCrMYyIhQis1jhDWt1SlvEr8rQ3O8GnppG9X gGORDdJAAYF4TrC42a4sPsctMQCg0ubimnlnw4ZjdN61YA8NN8V9/67pmN9qnzaJhA5f 
d7X8v8MDHnOI49YtMJ+NfqbxBjLdkDhgyqU801SQCuxr0IEtlReiUoFXfPtL4zooSgU6 OACA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725762396; x=1726367196; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=scmIlN57Rm25WLgsDvFofZSXreJWo4oYdwMtH2Klqjo=; b=Ck8el1Aeai0eRR72cTdHRPJyDekmNg/18jY/EVxGLHaQF1CRhf4aVOO98Iz2CQfsfB Bi+LVxs0NTH0MAaRz6lay55ELVei9QSHarONxYXluHsOcfjyvcUjGfDc14Lyq1v9F/hq 1WzzJfO178axs0PBsl7s08CdjmYIed9yY+gA15KHYDy/m2Vky6Z9tR8X6Fsn+NGMPbBr zRYguas8ITxiLqiOEvJpz98AHz5wSDC9p4LVTwDLrpUPnJgj4bSQz7WULqBiLnpDlW3j Ht+6lNxqLGKrCdOSNDh9xD87rQXr60PH/yndlqBJLjZx6u0JmME22LtKG1qAxRIOZdqI QKwQ== X-Gm-Message-State: AOJu0YzzqdoYuq6GN1gElM9/RuJDUePLgvvGKo+e4qagQPwe1MJ6ZAgU b5VGq9P6c6j7r0rFO3wqjyAtIhyJ5hWJssmR8kzVlIxrZR8UyXdFfDC/BktSby2x2jEvSrX4gVf X X-Google-Smtp-Source: AGHT+IFlJz+jw6RLMW3QTAwGnzyaCBP6Fhg6lSUze3oCRuvP57meC70MIx1Lv7vATdHYJCY8t224Zg== X-Received: by 2002:a17:90b:3c83:b0:2c9:6a2d:b116 with SMTP id 98e67ed59e1d1-2dad4dde0bemr7833299a91.7.1725762395626; Sat, 07 Sep 2024 19:26:35 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. 
[174.21.81.121]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2dadbfe46d4sm4084019a91.1.2024.09.07.19.26.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2024 19:26:35 -0700 (PDT)
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, liwei1518@gmail.com, bmeng.cn@gmail.com, qemu-stable@nongnu.org
Subject: [PATCH 01/12] tcg: Fix iteration step in 32-bit gvec operation
Date: Sat, 7 Sep 2024 19:26:21 -0700
Message-ID: <20240908022632.459477-2-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240908022632.459477-1-richard.henderson@linaro.org>
References: <20240908022632.459477-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Received-SPF: pass client-ip=2607:f8b0:4864:20::102d; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102d.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

From: TANG Tiancheng

The loop in the 32-bit case of the vector compare operation
was incorrectly incrementing by 8 bytes per iteration instead
of 4 bytes. This caused the function to process only half of
the intended elements.
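
To see why the stride matters, the following standalone C sketch models the
loop on a plain byte buffer instead of emitting TCG ops. The helper name
model_cmps32, its step parameter and the buffer layout are assumptions made
for illustration and are not QEMU code; only the load / negsetcond / store
pattern and the 8-versus-4 stride are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical host-side model of the 32-bit loop in tcg_gen_gvec_cmps.
 * Each visited 32-bit element is replaced by all-ones if it equals c and
 * by zero otherwise, which is what negsetcond yields for an EQ condition.
 * Returns the number of elements visited.
 */
static unsigned model_cmps32(uint8_t *dst, const uint8_t *src,
                             uint32_t oprsz, uint32_t c, uint32_t step)
{
    unsigned visited = 0;

    /* Keep every 4-byte access inside the operand. */
    assert(step != 0 && step % 4 == 0 && oprsz % 4 == 0);

    for (uint32_t i = 0; i < oprsz; i += step) {
        uint32_t elt;

        memcpy(&elt, src + i, sizeof(elt));   /* models tcg_gen_ld_i32 */
        elt = (elt == c) ? UINT32_MAX : 0;    /* models negsetcond, EQ */
        memcpy(dst + i, &elt, sizeof(elt));   /* models tcg_gen_st_i32 */
        visited++;
    }
    return visited;
}
```

With a 16-byte operand, a stride of 8 visits elements 0 and 2 only and leaves
elements 1 and 3 of the destination untouched; a stride of 4 visits all four.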

Cc: qemu-stable@nongnu.org
Fixes: 9622c697d1 (tcg: Add gvec compare with immediate and scalar operand)
Signed-off-by: TANG Tiancheng
Reviewed-by: Liu Zhiwei
Reviewed-by: Richard Henderson
Message-ID: <20240904142739.854-2-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson
Reviewed-by: Philippe Mathieu-Daudé
---
 tcg/tcg-op-gvec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 0308732d9b..78ee1ced80 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -3939,7 +3939,7 @@ void tcg_gen_gvec_cmps(TCGCond cond, unsigned vece, uint32_t dofs,
         uint32_t i;
 
         tcg_gen_extrl_i64_i32(t1, c);
-        for (i = 0; i < oprsz; i += 8) {
+        for (i = 0; i < oprsz; i += 4) {
             tcg_gen_ld_i32(t0, tcg_env, aofs + i);
             tcg_gen_negsetcond_i32(cond, t0, t0, t1);
             tcg_gen_st_i32(t0, tcg_env, dofs + i);

From patchwork Sun Sep 8 02:26:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13795314
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, liwei1518@gmail.com, bmeng.cn@gmail.com
Subject: [PATCH 02/12] tcg: Export vec_gen_6
Date: Sat, 7 Sep 2024 19:26:22 -0700
Message-ID: <20240908022632.459477-3-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240908022632.459477-1-richard.henderson@linaro.org>
References: <20240908022632.459477-1-richard.henderson@linaro.org>

Add declaration to tcg-internal.h, making it available for use
from tcg backend vector expanders.

Signed-off-by: Richard Henderson
Reviewed-by: Philippe Mathieu-Daudé
---
 tcg/tcg-internal.h | 2 ++
 tcg/tcg-op-vec.c   | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/tcg/tcg-internal.h b/tcg/tcg-internal.h
index 9b0d982f65..52103f4164 100644
--- a/tcg/tcg-internal.h
+++ b/tcg/tcg-internal.h
@@ -102,5 +102,7 @@ void tcg_gen_op6(TCGOpcode, TCGArg, TCGArg, TCGArg, TCGArg, TCGArg, TCGArg);
 void vec_gen_2(TCGOpcode, TCGType, unsigned, TCGArg, TCGArg);
 void vec_gen_3(TCGOpcode, TCGType, unsigned, TCGArg, TCGArg, TCGArg);
 void vec_gen_4(TCGOpcode, TCGType, unsigned, TCGArg, TCGArg, TCGArg, TCGArg);
+void vec_gen_6(TCGOpcode opc, TCGType type, unsigned vece, TCGArg r,
+               TCGArg a, TCGArg b, TCGArg c, TCGArg d, TCGArg e);
 
 #endif /* TCG_INTERNAL_H */
diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c
index 84af210bc0..d4bb4aee74 100644
--- a/tcg/tcg-op-vec.c
+++ b/tcg/tcg-op-vec.c
@@ -172,8 +172,8 @@ void vec_gen_4(TCGOpcode opc, TCGType type, unsigned vece,
     op->args[3] = c;
 }
 
-static void vec_gen_6(TCGOpcode opc, TCGType type, unsigned vece, TCGArg r,
-                      TCGArg a, TCGArg b, TCGArg c, TCGArg d, TCGArg e)
+void vec_gen_6(TCGOpcode opc, TCGType type, unsigned vece, TCGArg r,
+               TCGArg a, TCGArg b, TCGArg c, TCGArg d, TCGArg e)
 {
     TCGOp *op = tcg_emit_op(opc, 6);
     TCGOP_VECL(op) = type - TCG_TYPE_V64;

From patchwork Sun Sep 8 02:26:23 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13795304
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, liwei1518@gmail.com, bmeng.cn@gmail.com
Subject: [PATCH 03/12] tcg/i386: Split out tcg_out_vex_modrm_type
Date: Sat, 7 Sep 2024 19:26:23 -0700
Message-ID: <20240908022632.459477-4-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240908022632.459477-1-richard.henderson@linaro.org>
References: <20240908022632.459477-1-richard.henderson@linaro.org>

Helper function to handle setting of VEXL based on the type of
the operation.

Signed-off-by: Richard Henderson
Reviewed-by: Philippe Mathieu-Daudé
---
 tcg/i386/tcg-target.c.inc | 38 +++++++++++++++-----------------------
 1 file changed, 15 insertions(+), 23 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 9a54ef7f8d..af71a397b1 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -711,6 +711,15 @@ static void tcg_out_vex_modrm(TCGContext *s, int opc, int r, int v, int rm)
     tcg_out8(s, 0xc0 | (LOWREGMASK(r) << 3) | LOWREGMASK(rm));
 }
 
+static void tcg_out_vex_modrm_type(TCGContext *s, int opc,
+                                   int r, int v, int rm, TCGType type)
+{
+    if (type == TCG_TYPE_V256) {
+        opc |= P_VEXL;
+    }
+    tcg_out_vex_modrm(s, opc, r, v, rm);
+}
+
 /* Output an opcode with a full "rm + (index<

X-Patchwork-Id: 13795309
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, liwei1518@gmail.com, bmeng.cn@gmail.com
Subject: [PATCH 04/12] tcg/i386: Do not expand cmp_vec early
Date: Sat, 7 Sep 2024 19:26:24 -0700
Message-ID: <20240908022632.459477-5-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240908022632.459477-1-richard.henderson@linaro.org>
References: <20240908022632.459477-1-richard.henderson@linaro.org>

Move most of expansion to opcode generation, leaving the conversion
of unsigned to signed to be done in the early phase. Small
inefficiencies, but not incorrect results, are introduced until
cmpsel_vec is converted in the next patch.
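
The two ways an unsigned comparison gets rewritten in this patch can be
checked on scalars. The sketch below is a per-element model under stated
assumptions, not QEMU code: the function names are hypothetical, plain C
comparisons stand in for the PMINU*/PMAXU* and PCMPEQ*/PCMPGT* instructions,
and the 64-bit case mirrors the bias of 1ull << 63 that remains in
expand_vec_cmp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x <=u y  holds exactly when  umin(x, y) == x  (the NEED_UMIN case). */
static bool leu_via_umin(uint32_t x, uint32_t y)
{
    uint32_t m = x < y ? x : y;   /* stands in for an unsigned-min insn */
    return m == x;                /* stands in for a compare-equal insn */
}

/* x >=u y  holds exactly when  umax(x, y) == x  (the NEED_UMAX case). */
static bool geu_via_umax(uint32_t x, uint32_t y)
{
    uint32_t m = x > y ? x : y;   /* stands in for an unsigned-max insn */
    return m == x;
}

/*
 * x >u y for 64-bit elements using only a signed compare: subtracting
 * 2^63 from both inputs maps unsigned order onto signed order.
 */
static bool gtu_via_bias(uint64_t x, uint64_t y)
{
    const uint64_t bias = UINT64_C(1) << 63;
    /*
     * Assumption: converting an out-of-range uint64_t to int64_t wraps
     * (two's complement), as on all common compilers.
     */
    int64_t sx = (int64_t)(x - bias);
    int64_t sy = (int64_t)(y - bias);

    return sx > sy;               /* stands in for a signed greater-than */
}
```

The remaining conditions follow from these by swapping the operands or
inverting the result, which is what the NEED_SWAP and NEED_INV flags record.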
Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 223 +++++++++++++++++--------------------- 1 file changed, 100 insertions(+), 123 deletions(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index af71a397b1..278e567b56 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -3029,6 +3029,92 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, #undef OP_32_64 } +static int const umin_insn[4] = { + OPC_PMINUB, OPC_PMINUW, OPC_PMINUD, OPC_VPMINUQ +}; + +static int const umax_insn[4] = { + OPC_PMAXUB, OPC_PMAXUW, OPC_PMAXUD, OPC_VPMAXUQ +}; + +static bool tcg_out_cmp_vec_noinv(TCGContext *s, TCGType type, unsigned vece, + TCGReg v0, TCGReg v1, TCGReg v2, TCGCond cond) +{ + static int const cmpeq_insn[4] = { + OPC_PCMPEQB, OPC_PCMPEQW, OPC_PCMPEQD, OPC_PCMPEQQ + }; + static int const cmpgt_insn[4] = { + OPC_PCMPGTB, OPC_PCMPGTW, OPC_PCMPGTD, OPC_PCMPGTQ + }; + + enum { + NEED_INV = 1, + NEED_SWAP = 2, + NEED_UMIN = 4, + NEED_UMAX = 8, + INVALID = 16, + }; + static const uint8_t cond_fixup[16] = { + [0 ... 15] = INVALID, + [TCG_COND_EQ] = 0, + [TCG_COND_GT] = 0, + [TCG_COND_NE] = NEED_INV, + [TCG_COND_LE] = NEED_INV, + [TCG_COND_LT] = NEED_SWAP, + [TCG_COND_GE] = NEED_SWAP | NEED_INV, + [TCG_COND_LEU] = NEED_UMIN, + [TCG_COND_GTU] = NEED_UMIN | NEED_INV, + [TCG_COND_GEU] = NEED_UMAX, + [TCG_COND_LTU] = NEED_UMAX | NEED_INV, + }; + int fixup = cond_fixup[cond]; + + assert(!(fixup & INVALID)); + + if (fixup & NEED_INV) { + cond = tcg_invert_cond(cond); + } + + if (fixup & NEED_SWAP) { + TCGReg swap = v1; + v1 = v2; + v2 = swap; + cond = tcg_swap_cond(cond); + } + + if (fixup & (NEED_UMIN | NEED_UMAX)) { + int op = (fixup & NEED_UMIN ? umin_insn[vece] : umax_insn[vece]); + + /* avx2 does not have 64-bit min/max; adjusted during expand. 
*/ + assert(vece <= MO_32); + + tcg_out_vex_modrm_type(s, op, TCG_TMP_VEC, v1, v2, type); + v2 = TCG_TMP_VEC; + cond = TCG_COND_EQ; + } + + switch (cond) { + case TCG_COND_EQ: + tcg_out_vex_modrm_type(s, cmpeq_insn[vece], v0, v1, v2, type); + break; + case TCG_COND_GT: + tcg_out_vex_modrm_type(s, cmpgt_insn[vece], v0, v1, v2, type); + break; + default: + g_assert_not_reached(); + } + return fixup & NEED_INV; +} + +static void tcg_out_cmp_vec(TCGContext *s, TCGType type, unsigned vece, + TCGReg v0, TCGReg v1, TCGReg v2, TCGCond cond) +{ + if (tcg_out_cmp_vec_noinv(s, type, vece, v0, v1, v2, cond)) { + tcg_out_dupi_vec(s, type, vece, TCG_TMP_VEC, -1); + tcg_out_vex_modrm_type(s, OPC_PXOR, v0, v0, TCG_TMP_VEC, type); + } +} + static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned vecl, unsigned vece, const TCGArg args[TCG_MAX_OP_ARGS], @@ -3058,12 +3144,6 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, static int const shift_imm_insn[4] = { OPC_UD2, OPC_PSHIFTW_Ib, OPC_PSHIFTD_Ib, OPC_PSHIFTQ_Ib }; - static int const cmpeq_insn[4] = { - OPC_PCMPEQB, OPC_PCMPEQW, OPC_PCMPEQD, OPC_PCMPEQQ - }; - static int const cmpgt_insn[4] = { - OPC_PCMPGTB, OPC_PCMPGTW, OPC_PCMPGTD, OPC_PCMPGTQ - }; static int const punpckl_insn[4] = { OPC_PUNPCKLBW, OPC_PUNPCKLWD, OPC_PUNPCKLDQ, OPC_PUNPCKLQDQ }; @@ -3082,12 +3162,6 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, static int const smax_insn[4] = { OPC_PMAXSB, OPC_PMAXSW, OPC_PMAXSD, OPC_VPMAXSQ }; - static int const umin_insn[4] = { - OPC_PMINUB, OPC_PMINUW, OPC_PMINUD, OPC_VPMINUQ - }; - static int const umax_insn[4] = { - OPC_PMAXUB, OPC_PMAXUW, OPC_PMAXUD, OPC_VPMAXUQ - }; static int const rotlv_insn[4] = { OPC_UD2, OPC_UD2, OPC_VPROLVD, OPC_VPROLVQ }; @@ -3243,15 +3317,8 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, break; case INDEX_op_cmp_vec: - sub = args[3]; - if (sub == TCG_COND_EQ) { - insn = cmpeq_insn[vece]; - } else if (sub == TCG_COND_GT) { - insn = cmpgt_insn[vece]; - } 
else { - g_assert_not_reached(); - } - goto gen_simd; + tcg_out_cmp_vec(s, type, vece, a0, a1, a2, args[3]); + break; case INDEX_op_andc_vec: insn = OPC_PANDN; @@ -3971,88 +4038,19 @@ static void expand_vec_mul(TCGType type, unsigned vece, } } -static bool expand_vec_cmp_noinv(TCGType type, unsigned vece, TCGv_vec v0, - TCGv_vec v1, TCGv_vec v2, TCGCond cond) +static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, + TCGv_vec v1, TCGv_vec v2, TCGCond cond) { - enum { - NEED_INV = 1, - NEED_SWAP = 2, - NEED_BIAS = 4, - NEED_UMIN = 8, - NEED_UMAX = 16, - }; - TCGv_vec t1, t2, t3; - uint8_t fixup; + /* + * Without AVX512, there are no 64-bit unsigned comparisons. + * We must bias the inputs so that they become signed. + * All other swapping and inversion are handled during code generation. + */ + if (vece == MO_64 && is_unsigned_cond(cond)) { + TCGv_vec t1 = tcg_temp_new_vec(type); + TCGv_vec t2 = tcg_temp_new_vec(type); + TCGv_vec t3 = tcg_constant_vec(type, vece, 1ull << ((8 << vece) - 1)); - switch (cond) { - case TCG_COND_EQ: - case TCG_COND_GT: - fixup = 0; - break; - case TCG_COND_NE: - case TCG_COND_LE: - fixup = NEED_INV; - break; - case TCG_COND_LT: - fixup = NEED_SWAP; - break; - case TCG_COND_GE: - fixup = NEED_SWAP | NEED_INV; - break; - case TCG_COND_LEU: - if (tcg_can_emit_vec_op(INDEX_op_umin_vec, type, vece)) { - fixup = NEED_UMIN; - } else { - fixup = NEED_BIAS | NEED_INV; - } - break; - case TCG_COND_GTU: - if (tcg_can_emit_vec_op(INDEX_op_umin_vec, type, vece)) { - fixup = NEED_UMIN | NEED_INV; - } else { - fixup = NEED_BIAS; - } - break; - case TCG_COND_GEU: - if (tcg_can_emit_vec_op(INDEX_op_umax_vec, type, vece)) { - fixup = NEED_UMAX; - } else { - fixup = NEED_BIAS | NEED_SWAP | NEED_INV; - } - break; - case TCG_COND_LTU: - if (tcg_can_emit_vec_op(INDEX_op_umax_vec, type, vece)) { - fixup = NEED_UMAX | NEED_INV; - } else { - fixup = NEED_BIAS | NEED_SWAP; - } - break; - default: - g_assert_not_reached(); - } - - if (fixup & 
NEED_INV) { - cond = tcg_invert_cond(cond); - } - if (fixup & NEED_SWAP) { - t1 = v1, v1 = v2, v2 = t1; - cond = tcg_swap_cond(cond); - } - - t1 = t2 = NULL; - if (fixup & (NEED_UMIN | NEED_UMAX)) { - t1 = tcg_temp_new_vec(type); - if (fixup & NEED_UMIN) { - tcg_gen_umin_vec(vece, t1, v1, v2); - } else { - tcg_gen_umax_vec(vece, t1, v1, v2); - } - v2 = t1; - cond = TCG_COND_EQ; - } else if (fixup & NEED_BIAS) { - t1 = tcg_temp_new_vec(type); - t2 = tcg_temp_new_vec(type); - t3 = tcg_constant_vec(type, vece, 1ull << ((8 << vece) - 1)); tcg_gen_sub_vec(vece, t1, v1, t3); tcg_gen_sub_vec(vece, t2, v2, t3); v1 = t1; @@ -4060,26 +4058,9 @@ static bool expand_vec_cmp_noinv(TCGType type, unsigned vece, TCGv_vec v0, cond = tcg_signed_cond(cond); } - tcg_debug_assert(cond == TCG_COND_EQ || cond == TCG_COND_GT); /* Expand directly; do not recurse. */ vec_gen_4(INDEX_op_cmp_vec, type, vece, tcgv_vec_arg(v0), tcgv_vec_arg(v1), tcgv_vec_arg(v2), cond); - - if (t1) { - tcg_temp_free_vec(t1); - if (t2) { - tcg_temp_free_vec(t2); - } - } - return fixup & NEED_INV; -} - -static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, - TCGv_vec v1, TCGv_vec v2, TCGCond cond) -{ - if (expand_vec_cmp_noinv(type, vece, v0, v1, v2, cond)) { - tcg_gen_not_vec(vece, v0, v0); - } } static void expand_vec_cmpsel(TCGType type, unsigned vece, TCGv_vec v0, @@ -4088,11 +4069,7 @@ static void expand_vec_cmpsel(TCGType type, unsigned vece, TCGv_vec v0, { TCGv_vec t = tcg_temp_new_vec(type); - if (expand_vec_cmp_noinv(type, vece, t, c1, c2, cond)) { - /* Invert the sense of the compare by swapping arguments. 
*/ - TCGv_vec x; - x = v3, v3 = v4, v4 = x; - } + expand_vec_cmp(type, vece, t, c1, c2, cond); vec_gen_4(INDEX_op_x86_vpblendvb_vec, type, vece, tcgv_vec_arg(v0), tcgv_vec_arg(v4), tcgv_vec_arg(v3), tcgv_vec_arg(t)); From patchwork Sun Sep 8 02:26:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13795313 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 022C8ECE579 for ; Sun, 8 Sep 2024 02:28:28 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sn7dV-0003ID-7l; Sat, 07 Sep 2024 22:26:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sn7dS-00037c-W8 for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:43 -0400 Received: from mail-pj1-x1032.google.com ([2607:f8b0:4864:20::1032]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sn7dQ-00050T-UE for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:42 -0400 Received: by mail-pj1-x1032.google.com with SMTP id 98e67ed59e1d1-2d89229ac81so2563991a91.0 for ; Sat, 07 Sep 2024 19:26:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725762400; x=1726367200; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=/08vmd3pJmcPxGI9Dvd5g1DO6IpTA5tvF/u1ewyB0/g=; b=ZwcQSX6hYoooENeOkNk+rGAFIrHe7ICOmu/nwW3D5or/ppm73uzzz7eD6pjqnxgy2X 
kzPilsGxzaLFoOUh8+pQ2Guwnt31ODE4l9kCIv3zL0zk32WaWuP+SUrDgDkslOJUWmpH fD9AGwZQPXymsmhYeqQj3yq/2gh172xFKlQBeKeA44ENKjc06cwukRayXoMIecobB2R5 b/sSfMD+8KWgHYbllzn4NNSlgrfaYHcs+uZlyeyVm/4SRhXBdXhZbjbkxHRQFJ/d2meb VenJFeSSUSekheNxkVM2J14uUy9nWp9shCe6qhjZLiiT+IOln14yK6rl5UrXOk1GB6gQ TyLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725762400; x=1726367200; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/08vmd3pJmcPxGI9Dvd5g1DO6IpTA5tvF/u1ewyB0/g=; b=cncUt62AO07tEH4sAOamT/xYdBr++k/bKO7gUnWjbXNHHG4VD/eMCDnaKyh6yUHEt2 Lgf9QheOw+1o98aOsSEOGgcrCaoCw4DmVmxyytsTe1xWk24m/+oJHDBEZmSKMZdnlgsU n2PRijNmXLBMIN78pGArI0/KZMXybeO4jn5563oSHeOdxdmjFImYzL6mlAFLQOqZu49e YLCwvTbrVPksM1W1YB9xDhN5Cu7xFZn3LFkEpPP2ZS6fEZza6SQaI6TWAQd4h8jsxMb+ TuaWPW/66ttV+gUYHTSg02ysiCkEdnlH/c6khDWlFHiB1vkHxjKnTAbHfYhA+Dxz2r+b BCPw== X-Gm-Message-State: AOJu0Yz8yH88ULAIHSFvq8qLoytKg2ZXcHcJ5K3U8zcMishurjfczPQB 1ztaFF7gH1bzhRugDLyTLEvd1StWrgVh6axBiiN+pRABJdOYB9OMm5XBH8EWHfPpr7sWWgvkVPz B X-Google-Smtp-Source: AGHT+IExLPw53HHUSVt5VYdyPs0TuPFN+qapmf43dDBeVnfCwItHNuWzkRH1oxbAmcG8oCNs+nr7CQ== X-Received: by 2002:a17:90b:3b8f:b0:2cf:f3e9:d5c8 with SMTP id 98e67ed59e1d1-2dad50d0d8amr8830659a91.31.1725762399523; Sat, 07 Sep 2024 19:26:39 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. 
[174.21.81.121]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2dadbfe46d4sm4084019a91.1.2024.09.07.19.26.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2024 19:26:39 -0700 (PDT)
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, liwei1518@gmail.com, bmeng.cn@gmail.com
Subject: [PATCH 05/12] tcg/i386: Do not expand cmpsel_vec early
Date: Sat, 7 Sep 2024 19:26:25 -0700
Message-ID: <20240908022632.459477-6-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240908022632.459477-1-richard.henderson@linaro.org>
References: <20240908022632.459477-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Received-SPF: pass client-ip=2607:f8b0:4864:20::1032; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1032.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

Expand during output instead of during opcode generation.
Remove x86_vpblendvb_vec opcode, since this removes the only user.

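For readers following the diff in this message: expand_vec_cond handles the 64-bit unsigned case by subtracting the sign bit from both operands and then using the signed form of the condition. A minimal scalar sketch of that idea in C follows. It is an illustration only, not QEMU code; the helper names ltu_via_signed and self_check are hypothetical, and the sketch assumes the usual two's-complement conversion from uint64_t to int64_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: unsigned "a < b" computed with a signed compare.
 * Subtracting the sign bit maps the unsigned range onto the signed range
 * while preserving order.
 */
static bool ltu_via_signed(uint64_t a, uint64_t b)
{
    const uint64_t bias = 1ull << 63;   /* sign bit of a 64-bit lane */
    int64_t sa = (int64_t)(a - bias);   /* subtraction wraps modulo 2^64 */
    int64_t sb = (int64_t)(b - bias);

    return sa < sb;
}

/* Compare the biased signed result with the native unsigned compare. */
static void self_check(void)
{
    const uint64_t vals[] = { 0, 1, INT64_MAX, 1ull << 63, UINT64_MAX };

    for (int i = 0; i < 5; i++) {
        for (int j = 0; j < 5; j++) {
            assert(ltu_via_signed(vals[i], vals[j]) == (vals[i] < vals[j]));
        }
    }
}
```

Calling self_check() exercises the boundary values around the sign bit, which is where a plain signed compare of the unbiased inputs would give the wrong answer.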
Signed-off-by: Richard Henderson
---
 tcg/i386/tcg-target-con-set.h | 1 +
 tcg/i386/tcg-target.h | 2 +-
 tcg/i386/tcg-target.opc.h | 1 -
 tcg/i386/tcg-target.c.inc | 84 +++++++++++++++++++++--------------
 4 files changed, 53 insertions(+), 35 deletions(-)

diff --git a/tcg/i386/tcg-target-con-set.h b/tcg/i386/tcg-target-con-set.h
index e24241cfa2..da4411d96b 100644
--- a/tcg/i386/tcg-target-con-set.h
+++ b/tcg/i386/tcg-target-con-set.h
@@ -50,6 +50,7 @@ C_N1_I2(r, r, r)
 C_N1_I2(r, r, rW)
 C_O1_I3(x, 0, x, x)
 C_O1_I3(x, x, x, x)
+C_O1_I4(x, x, x, x, x)
 C_O1_I4(r, r, reT, r, 0)
 C_O1_I4(r, r, r, ri, ri)
 C_O2_I1(r, r, L)
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 2f67a97e05..342be30c4c 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -223,7 +223,7 @@ typedef enum {
 #define TCG_TARGET_HAS_sat_vec 1
 #define TCG_TARGET_HAS_minmax_vec 1
 #define TCG_TARGET_HAS_bitsel_vec have_avx512vl
-#define TCG_TARGET_HAS_cmpsel_vec -1
+#define TCG_TARGET_HAS_cmpsel_vec 1
 #define TCG_TARGET_HAS_tst_vec 0
 
 #define TCG_TARGET_deposit_i32_valid(ofs, len) \
diff --git a/tcg/i386/tcg-target.opc.h b/tcg/i386/tcg-target.opc.h
index b5f403e35e..4ffc084bda 100644
--- a/tcg/i386/tcg-target.opc.h
+++ b/tcg/i386/tcg-target.opc.h
@@ -25,7 +25,6 @@
  */
 
 DEF(x86_shufps_vec, 1, 2, 1, IMPLVEC)
-DEF(x86_vpblendvb_vec, 1, 3, 0, IMPLVEC)
 DEF(x86_blend_vec, 1, 2, 1, IMPLVEC)
 DEF(x86_packss_vec, 1, 2, 0, IMPLVEC)
 DEF(x86_packus_vec, 1, 2, 0, IMPLVEC)
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 278e567b56..a04dc7d270 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -3115,6 +3115,19 @@ static void tcg_out_cmp_vec(TCGContext *s, TCGType type, unsigned vece,
     }
 }
 
+static void tcg_out_cmpsel_vec(TCGContext *s, TCGType type, unsigned vece,
+                               TCGReg v0, TCGReg c1, TCGReg c2,
+                               TCGReg v3, TCGReg v4, TCGCond cond)
+{
+    if (tcg_out_cmp_vec_noinv(s, type, vece, TCG_TMP_VEC, c1, c2, cond)) {
+        TCGReg swap = v3;
+        v3 = v4;
+        v4 = swap;
+    }
+    tcg_out_vex_modrm_type(s, OPC_VPBLENDVB, v0, v4, v3, type);
+    tcg_out8(s, (TCG_TMP_VEC - TCG_REG_XMM0) << 4);
+}
+
 static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
                            unsigned vecl, unsigned vece,
                            const TCGArg args[TCG_MAX_OP_ARGS],
@@ -3320,6 +3333,11 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
         tcg_out_cmp_vec(s, type, vece, a0, a1, a2, args[3]);
         break;
 
+    case INDEX_op_cmpsel_vec:
+        tcg_out_cmpsel_vec(s, type, vece, a0, a1, a2,
+                           args[3], args[4], args[5]);
+        break;
+
     case INDEX_op_andc_vec:
         insn = OPC_PANDN;
         tcg_out_vex_modrm_type(s, insn, a0, a2, a1, type);
@@ -3431,11 +3449,6 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
         tcg_out8(s, sub);
         break;
 
-    case INDEX_op_x86_vpblendvb_vec:
-        tcg_out_vex_modrm_type(s, OPC_VPBLENDVB, a0, a1, a2, type);
-        tcg_out8(s, args[3] << 4);
-        break;
-
     case INDEX_op_x86_psrldq_vec:
         tcg_out_vex_modrm(s, OPC_GRP14, 3, a0, a1);
         tcg_out8(s, a2);
@@ -3701,8 +3714,9 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
         return C_O1_I3(x, 0, x, x);
 
     case INDEX_op_bitsel_vec:
-    case INDEX_op_x86_vpblendvb_vec:
         return C_O1_I3(x, x, x, x);
+    case INDEX_op_cmpsel_vec:
+        return C_O1_I4(x, x, x, x, x);
 
     default:
         g_assert_not_reached();
@@ -4038,8 +4052,8 @@ static void expand_vec_mul(TCGType type, unsigned vece,
     }
 }
 
-static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0,
-                           TCGv_vec v1, TCGv_vec v2, TCGCond cond)
+static TCGCond expand_vec_cond(TCGType type, unsigned vece,
+                               TCGArg *a1, TCGArg *a2, TCGCond cond)
 {
     /*
      * Without AVX512, there are no 64-bit unsigned comparisons.
@@ -4047,46 +4061,50 @@ static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0,
      * All other swapping and inversion are handled during code generation.
      */
     if (vece == MO_64 && is_unsigned_cond(cond)) {
+        TCGv_vec v1 = temp_tcgv_vec(arg_temp(*a1));
+        TCGv_vec v2 = temp_tcgv_vec(arg_temp(*a2));
         TCGv_vec t1 = tcg_temp_new_vec(type);
         TCGv_vec t2 = tcg_temp_new_vec(type);
         TCGv_vec t3 = tcg_constant_vec(type, vece, 1ull << ((8 << vece) - 1));
 
         tcg_gen_sub_vec(vece, t1, v1, t3);
         tcg_gen_sub_vec(vece, t2, v2, t3);
-        v1 = t1;
-        v2 = t2;
+        *a1 = tcgv_vec_arg(t1);
+        *a2 = tcgv_vec_arg(t2);
         cond = tcg_signed_cond(cond);
     }
-
-    /* Expand directly; do not recurse. */
-    vec_gen_4(INDEX_op_cmp_vec, type, vece,
-              tcgv_vec_arg(v0), tcgv_vec_arg(v1), tcgv_vec_arg(v2), cond);
+    return cond;
 }
 
-static void expand_vec_cmpsel(TCGType type, unsigned vece, TCGv_vec v0,
-                              TCGv_vec c1, TCGv_vec c2,
-                              TCGv_vec v3, TCGv_vec v4, TCGCond cond)
+static void expand_vec_cmp(TCGType type, unsigned vece, TCGArg a0,
+                           TCGArg a1, TCGArg a2, TCGCond cond)
 {
-    TCGv_vec t = tcg_temp_new_vec(type);
+    cond = expand_vec_cond(type, vece, &a1, &a2, cond);
+    /* Expand directly; do not recurse. */
+    vec_gen_4(INDEX_op_cmp_vec, type, vece, a0, a1, a2, cond);
+}
 
-    expand_vec_cmp(type, vece, t, c1, c2, cond);
-    vec_gen_4(INDEX_op_x86_vpblendvb_vec, type, vece,
-              tcgv_vec_arg(v0), tcgv_vec_arg(v4),
-              tcgv_vec_arg(v3), tcgv_vec_arg(t));
-    tcg_temp_free_vec(t);
+static void expand_vec_cmpsel(TCGType type, unsigned vece, TCGArg a0,
+                              TCGArg a1, TCGArg a2,
+                              TCGArg a3, TCGArg a4, TCGCond cond)
+{
+    cond = expand_vec_cond(type, vece, &a1, &a2, cond);
+    /* Expand directly; do not recurse. */
+    vec_gen_6(INDEX_op_cmpsel_vec, type, vece, a0, a1, a2, a3, a4, cond);
 }
 
 void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece,
                        TCGArg a0, ...)
 {
     va_list va;
-    TCGArg a2;
-    TCGv_vec v0, v1, v2, v3, v4;
+    TCGArg a1, a2, a3, a4, a5;
+    TCGv_vec v0, v1, v2;
 
     va_start(va, a0);
-    v0 = temp_tcgv_vec(arg_temp(a0));
-    v1 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg)));
+    a1 = va_arg(va, TCGArg);
     a2 = va_arg(va, TCGArg);
+    v0 = temp_tcgv_vec(arg_temp(a0));
+    v1 = temp_tcgv_vec(arg_temp(a1));
 
     switch (opc) {
     case INDEX_op_shli_vec:
@@ -4122,15 +4140,15 @@ void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece,
         break;
 
     case INDEX_op_cmp_vec:
-        v2 = temp_tcgv_vec(arg_temp(a2));
-        expand_vec_cmp(type, vece, v0, v1, v2, va_arg(va, TCGArg));
+        a3 = va_arg(va, TCGArg);
+        expand_vec_cmp(type, vece, a0, a1, a2, a3);
         break;
 
     case INDEX_op_cmpsel_vec:
-        v2 = temp_tcgv_vec(arg_temp(a2));
-        v3 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg)));
-        v4 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg)));
-        expand_vec_cmpsel(type, vece, v0, v1, v2, v3, v4, va_arg(va, TCGArg));
+        a3 = va_arg(va, TCGArg);
+        a4 = va_arg(va, TCGArg);
+        a5 = va_arg(va, TCGArg);
+        expand_vec_cmpsel(type, vece, a0, a1, a2, a3, a4, a5);
         break;
 
     default:

From patchwork Sun Sep 8 02:26:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13795316
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BF28EECE577 for ; Sun, 8 Sep 2024 02:29:05 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sn7dW-0003Mj-5f; Sat, 07 Sep 2024 22:26:46 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sn7dT-0003AQ-FY for
qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:43 -0400 Received: from mail-ot1-x336.google.com ([2607:f8b0:4864:20::336]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sn7dR-00050n-TS for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:43 -0400 Received: by mail-ot1-x336.google.com with SMTP id 46e09a7af769-710da8668b3so479554a34.1 for ; Sat, 07 Sep 2024 19:26:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725762401; x=1726367201; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=/GYQH+qlsiZNusd91SlPcZXeCA8M1xc1mlbTAMFEsWI=; b=Gb+ABANnyglRwCEUKdp1iR5BZKFJJW40P6+kTmoyAG4mpkNY6+6cThqutRiZ/ar1dt c7ILOiTyiUHdBPjOY9GS8rcyrnZoYaX3zTYcduwpecjnTJ0pLVQUEgtizLU8tl2CqXk9 hTQjUtBx3r84tLvXHhf5L0z/iGVk3kOL6Lgn6LqI9KL8ZQOpn5y0lUNNwD8os0duUi+c H+DPXVANrc/gjZ94gCcJgq9GRjS6xR1R2Z7LTajmPIRle4M5iI1xrYOj3x+SU2IOe+yO vKv+jIv27V4f2agrnnlHaXNs2lETLsSTZ7JxxbQQAFksfb0TSkTzCd/KnfIPw5WfGHf3 93hg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725762401; x=1726367201; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/GYQH+qlsiZNusd91SlPcZXeCA8M1xc1mlbTAMFEsWI=; b=O7FaezAotExGCDzvUbJCgqMwk45Cijrq7HvmRLiGj/cNFxQumDcEl8blqpAiqnLEeF 51bvnlDt6urH22qTxAQSrRh7fJGohRoRMctnhK62LHhasUXZJ5Wm0JcKpwkl3Gyy39f5 ROF8Zbx2Y3efL5sv6Lq9ArGBZTPoTiwhib2RumiG6c76b8ZY85ut1PkKRBPcoEkCOUbE 19BE+RSz7Gf6SCL+mDhBJz/ufMIRcXTeQ0PTd8/hy2G9f0luxVNG6CLYNfTXZlKBCQCU WFy9ipnoiFmCrwARzWE8MIxtXARsi6X7YpDgE0LDpGdxEMS6D/dao1enEsR07k5NNt8B kHAw== X-Gm-Message-State: AOJu0YwvhIYtM0VfHJWBevTfbbSEfLThE6eC/RI29EEaB+hOP/AGEEWJ WC6IRLFLwV/E6p9PXoIR04weVpLRoxhGjMStxVMp11E1+wm3HxGAhRxFdJkYwCMVHDfL9EyUORi Y X-Google-Smtp-Source: 
AGHT+IFPV5ddEwzUHQevvpH5mWuwC7UH0ippviagH5gjYNuXdjVgVZppCmxZRGaBTa5khBTUA6Kygg==
X-Received: by 2002:a05:6830:4411:b0:708:d860:e51b with SMTP id 46e09a7af769-710cc22c384mr8034581a34.15.1725762400534; Sat, 07 Sep 2024 19:26:40 -0700 (PDT)
Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2dadbfe46d4sm4084019a91.1.2024.09.07.19.26.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2024 19:26:40 -0700 (PDT)
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, liwei1518@gmail.com, bmeng.cn@gmail.com
Subject: [PATCH 06/12] tcg/optimize: Fold movcond with true and false values identical
Date: Sat, 7 Sep 2024 19:26:26 -0700
Message-ID: <20240908022632.459477-7-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240908022632.459477-1-richard.henderson@linaro.org>
References: <20240908022632.459477-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Received-SPF: pass client-ip=2607:f8b0:4864:20::336; envelope-from=richard.henderson@linaro.org; helo=mail-ot1-x336.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

Fold "x = cond ? y : y" to "x = y".

Signed-off-by: Richard Henderson
Reviewed-by: Philippe Mathieu-Daudé
---
 tcg/optimize.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index ba16ec27e2..cf311790e0 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -1851,6 +1851,11 @@ static bool fold_movcond(OptContext *ctx, TCGOp *op)
 {
     int i;
 
+    /* If true and false values are the same, eliminate the cmp. */
+    if (args_are_copies(op->args[3], op->args[4])) {
+        return tcg_opt_gen_mov(ctx, op, op->args[0], op->args[3]);
+    }
+
     /*
      * Canonicalize the "false" input reg to match the destination reg so
      * that the tcg backend can implement a "move if true" operation.

From patchwork Sun Sep 8 02:26:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13795315
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F0C2DECE577 for ; Sun, 8 Sep 2024 02:28:36 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sn7dW-0003Pv-QX; Sat, 07 Sep 2024 22:26:46 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sn7dU-0003GG-QB for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:44 -0400
Received: from mail-pj1-x102c.google.com ([2607:f8b0:4864:20::102c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sn7dT-00051L-A6 for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:44 -0400
Received: by mail-pj1-x102c.google.com with SMTP id 98e67ed59e1d1-2da4ea59658so2358911a91.0 for ; Sat, 07 Sep
2024 19:26:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725762402; x=1726367202; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=tk0wGAYaA1BFCH8Bi5mZsOhd6+SwTodI8yFR1/zj/XA=; b=rbW7uDKULEpLP6dC06hrDi0HpQ81gLwoLSWkjtetpgxdmt1DscLfXomdHdUUOpgwfH uMZGSrWrV9Frayz8Pzp29GUY9ZRqwciIFe3fO5a03VmrgwUK9zp1Sg2sC/muIHZJ1Qfy 4K0eFCQ6Ls5fm/O5xAjW+3hpCfpaApgyxdElTSXfzIHYamnXBvFituCLjDj93b8V9nmk /eUeEv9ZQFeii5bdNLtY+xVKEymenwUhLbPeKHpldwP0oMF0C4Hulz7NaUaUlksdVrgZ e7BuSnajTCCwJTVCbqtHkV+km3hTxBcTDxrhWo9tuqKrwJiN1ORNH+Gsjmiabwh1nstJ teWw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725762402; x=1726367202; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tk0wGAYaA1BFCH8Bi5mZsOhd6+SwTodI8yFR1/zj/XA=; b=YwA7KSQi5QTS0zXyCG4zdcuFJuh5S6cx2J7Mj/mRL1Dj2r9jMKWWSKHX8hyNFczmcB 9QWLLm2d2O3v7YILECll9M7f3z/83v8b8YvbL8R+NTwRo3EBhNspOzfK9mm0FZS58a2j IvfcV4FOznVWpw4rhZg0F3dPxlbajAFyDn5oL+DNlbcVChzbjoSgpbNCI4V2lWAy18pj up8CfZBHvFmmFqU5btZWvdbFHGJ6e/nPtEdSonaYvVjYKG39jhnOaVCA+Ct3fCGPEBa2 UhLnDR2wIx6QgEClOC2IrCRB0Vf2zXY43oiOGhHmO+0oTlAzmreSjhFOnzJK4azE0dpN x4hA== X-Gm-Message-State: AOJu0YzKj36WTP63K4muhUvqAd2GN5oRyhpuLmXD3yxeGGhV9+fOdFot bod1dA8YbcRJO8owFT3R2es2mcl4xWUr8ZNH1C+MABY8OztxS4x1Lp3lYUteA1WuAY2+6fH20zC 7 X-Google-Smtp-Source: AGHT+IHuXrKEZmk4TmwHjWsaZ7xwLfaUeqxpeG8iHvFUHwUiJQDQposs4PvIWELoMupL25ZCASBalQ== X-Received: by 2002:a17:90a:ee87:b0:2c8:858:7035 with SMTP id 98e67ed59e1d1-2dad50cbcb1mr7287239a91.25.1725762401715; Sat, 07 Sep 2024 19:26:41 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. 
[174.21.81.121]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2dadbfe46d4sm4084019a91.1.2024.09.07.19.26.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2024 19:26:41 -0700 (PDT)
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, liwei1518@gmail.com, bmeng.cn@gmail.com
Subject: [PATCH 07/12] tcg/optimize: Optimize cmp_vec and cmpsel_vec
Date: Sat, 7 Sep 2024 19:26:27 -0700
Message-ID: <20240908022632.459477-8-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240908022632.459477-1-richard.henderson@linaro.org>
References: <20240908022632.459477-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Received-SPF: pass client-ip=2607:f8b0:4864:20::102c; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102c.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

Place immediate values second in the comparison.
Place destination matches first in the true/false values.
All of this mirrors what we do for integer setcond and movcond.

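A small standalone sketch of why these two canonicalizations preserve the result, using a scalar model in C. The Cond enum and the helpers eval, swap_cond, invert_cond, cmpsel, check_rewrites and self_check are hypothetical stand-ins written for this illustration, not the TCG definitions: swapping the compared operands requires the swapped condition, and exchanging the true/false values requires the inverted condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical condition codes for this sketch only (not TCGCond). */
typedef enum { C_EQ, C_NE, C_LT, C_GE, C_LE, C_GT } Cond;

static bool eval(Cond c, int64_t a, int64_t b)
{
    switch (c) {
    case C_EQ: return a == b;
    case C_NE: return a != b;
    case C_LT: return a < b;
    case C_GE: return a >= b;
    case C_LE: return a <= b;
    default:   return a > b;    /* C_GT */
    }
}

/* Condition that holds for (b, a) exactly when c holds for (a, b). */
static Cond swap_cond(Cond c)
{
    switch (c) {
    case C_LT: return C_GT;
    case C_GT: return C_LT;
    case C_LE: return C_GE;
    case C_GE: return C_LE;
    default:   return c;        /* EQ and NE are symmetric */
    }
}

/* Condition that holds exactly when c does not. */
static Cond invert_cond(Cond c)
{
    switch (c) {
    case C_EQ: return C_NE;
    case C_NE: return C_EQ;
    case C_LT: return C_GE;
    case C_GE: return C_LT;
    case C_LE: return C_GT;
    default:   return C_LE;     /* C_GT */
    }
}

/* Scalar model of compare-and-select. */
static int64_t cmpsel(Cond c, int64_t a, int64_t b, int64_t tv, int64_t fv)
{
    return eval(c, a, b) ? tv : fv;
}

/* True if both rewrites give the same result as the original form. */
static bool check_rewrites(Cond c, int64_t a, int64_t b)
{
    int64_t ref = cmpsel(c, a, b, 111, 222);

    return ref == cmpsel(swap_cond(c), b, a, 111, 222)      /* operands swapped */
        && ref == cmpsel(invert_cond(c), a, b, 222, 111);   /* values swapped */
}

/* Exercise every condition on a few ordered and equal pairs. */
static void self_check(void)
{
    const int64_t vals[] = { -3, 0, 7 };

    for (int c = C_EQ; c <= C_GT; c++) {
        for (int i = 0; i < 3; i++) {
            for (int j = 0; j < 3; j++) {
                assert(check_rewrites((Cond)c, vals[i], vals[j]));
            }
        }
    }
}
```

Note that the two rewrites use different condition transforms: confusing "swap" with "invert" would silently change the result for the ordered conditions, which is what check_rewrites guards against.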
Signed-off-by: Richard Henderson
---
 tcg/optimize.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index cf311790e0..f11f576fd4 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -2422,6 +2422,36 @@ static bool fold_setcond2(OptContext *ctx, TCGOp *op)
     return tcg_opt_gen_movi(ctx, op, op->args[0], i);
 }
 
+static bool fold_cmp_vec(OptContext *ctx, TCGOp *op)
+{
+    /* Canonicalize the comparison to put immediate second. */
+    if (swap_commutative(NO_DEST, &op->args[1], &op->args[2])) {
+        op->args[3] = tcg_swap_cond(op->args[3]);
+    }
+    return false;
+}
+
+static bool fold_cmpsel_vec(OptContext *ctx, TCGOp *op)
+{
+    /* If true and false values are the same, eliminate the cmp. */
+    if (args_are_copies(op->args[3], op->args[4])) {
+        return tcg_opt_gen_mov(ctx, op, op->args[0], op->args[3]);
+    }
+
+    /* Canonicalize the comparison to put immediate second. */
+    if (swap_commutative(NO_DEST, &op->args[1], &op->args[2])) {
+        op->args[5] = tcg_swap_cond(op->args[5]);
+    }
+    /*
+     * Canonicalize the "false" input reg to match the destination,
+     * so that the tcg backend can implement "move if true".
+     */
+    if (swap_commutative(op->args[0], &op->args[4], &op->args[3])) {
+        op->args[5] = tcg_invert_cond(op->args[5]);
+    }
+    return false;
+}
+
 static bool fold_sextract(OptContext *ctx, TCGOp *op)
 {
     uint64_t z_mask, s_mask, s_mask_old;
@@ -2928,6 +2958,12 @@ void tcg_optimize(TCGContext *s)
         case INDEX_op_setcond2_i32:
             done = fold_setcond2(&ctx, op);
             break;
+        case INDEX_op_cmp_vec:
+            done = fold_cmp_vec(&ctx, op);
+            break;
+        case INDEX_op_cmpsel_vec:
+            done = fold_cmpsel_vec(&ctx, op);
+            break;
         CASE_OP_32_64(sextract):
             done = fold_sextract(&ctx, op);
             break;

From patchwork Sun Sep 8 02:26:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13795307
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id ECD2CECE577 for ; Sun, 8 Sep 2024 02:28:09 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sn7dX-0003U1-MW; Sat, 07 Sep 2024 22:26:47 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sn7dV-0003LX-RP for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:45 -0400
Received: from mail-pj1-x1030.google.com ([2607:f8b0:4864:20::1030]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sn7dU-000520-AD for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:45 -0400
Received: by mail-pj1-x1030.google.com with SMTP id 98e67ed59e1d1-2d873dc644dso2407582a91.3 for ; Sat, 07 Sep 2024 19:26:43 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org;
s=google; t=1725762403; x=1726367203; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=+zeN6h7OSkzIznAtGXkS2TM05Vzv6xYEK4cYNsvDB5Y=; b=fgxrxszay9VRn5E174A3JWPdCzhUbe6SpMnwMRLmxtGBuiPjzmruzAZGYbwqig0TSK ELPqGje4TE3rIt+qJ33coqFSjCt3XVM+QXs8THuMV9TRE76N72XR0NEJpgfuaMJzXvPD GJWzvVFzsZYmYQCisNPZn6727WdMfQjGyur5Suk4rObtyENKT2I/ALQi4Z3/HpoUFQlf QlKE+nwnu2auCvG4S/qzP01WtdDGeLbNSjXgKyWmsIb+Fa9HN62ry5nyXbs8QBLBPKQQ v7O8WBWMIEF/TBZV0mLH8pYc03eMJO4nKtbSij6FCKb2FdU5rve+CRRLKDG/x4ypRzP0 z7iQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725762403; x=1726367203; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+zeN6h7OSkzIznAtGXkS2TM05Vzv6xYEK4cYNsvDB5Y=; b=a/+CrexhNjvmg5YMWeIb0JCdN+fPcp3P6q02iG+SWZxFco7wS9IqAQfDK8Of6bjeG0 kCgqIBUHcHQJqKjdQbjorp8eJ/y9CxzbmJzSDwW16dGDDxnGqdTFuTimKvYn+nGGyaFb j8TWCFpurTBA8ojj9n+zCTt34QTlyRSlTjTK0d8q5u6g6CPMKs93fpMUO9rEjPt9CFYc gpZ5JlV/a9PNizKO0wWkcFSgHly3Pixfn9XCc3WTYEL8tjAvv8/k5VYyAdXZcc4py53z DXnp8KTea3H5XfbKlqPfZZHEnVAAZUsqNhlrLVo8akQmKw4BOFP3SJd0nnmmIZf++4yw Qrww== X-Gm-Message-State: AOJu0Yzy3/D+aNM9JNJYL/zW1NxJCuto35vqMwksj+8PaTZ8ewIADnsb /cUnIAEsdg6mQab1LqTHxLKBUkga9msXL9MjGEOMNxqmeaQJhUFZiPukIhU3Utcs4qATNGUjnOY b X-Google-Smtp-Source: AGHT+IHU4gZrp8G+R9jE7kfDXii3XHwlmfNyKPRaq86Q9mpD/aG7CPKQ1YUAfSU1k1Snp8PrcFRcbQ== X-Received: by 2002:a17:90a:8c8c:b0:2cb:e429:f525 with SMTP id 98e67ed59e1d1-2dafd09713amr4639854a91.33.1725762402625; Sat, 07 Sep 2024 19:26:42 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. 
[174.21.81.121]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2dadbfe46d4sm4084019a91.1.2024.09.07.19.26.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2024 19:26:42 -0700 (PDT)
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, liwei1518@gmail.com, bmeng.cn@gmail.com
Subject: [PATCH 08/12] tcg/optimize: Optimize bitsel_vec
Date: Sat, 7 Sep 2024 19:26:28 -0700
Message-ID: <20240908022632.459477-9-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240908022632.459477-1-richard.henderson@linaro.org>
References: <20240908022632.459477-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Received-SPF: pass client-ip=2607:f8b0:4864:20::1030; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1030.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

Fold matching true/false operands.
Fold true/false operands with 0/-1 to simpler logicals.

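The folds rest on the bitwise definition of the select, assumed here to be (tv & a) | (fv & ~a) with a as the selector. A scalar sketch in C of the identities involved follows; the names bitsel, bitsel_identities_hold and self_check are hypothetical and exist only for this illustration, and the sketch models one 64-bit word rather than a vector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Scalar model: take bits of tv where a is 1, bits of fv where a is 0. */
static uint64_t bitsel(uint64_t a, uint64_t tv, uint64_t fv)
{
    return (tv & a) | (fv & ~a);
}

/* True if every simplification agrees with bitsel for the pair (a, x). */
static bool bitsel_identities_hold(uint64_t a, uint64_t x)
{
    const uint64_t ones = UINT64_MAX;       /* the "-1" constant */

    return bitsel(a, x, x) == x             /* matching operands: plain move */
        && bitsel(a, ones, 0) == a          /* -1 / 0: the selector itself */
        && bitsel(a, 0, ones) == ~a         /* 0 / -1: not */
        && bitsel(a, ones, x) == (a | x)    /* true = -1: or */
        && bitsel(a, 0, x) == (x & ~a)      /* true = 0: andc */
        && bitsel(a, x, 0) == (a & x)       /* false = 0: and */
        && bitsel(a, x, ones) == (x | ~a);  /* false = -1: orc */
}

/* Check the identities on a few selector/value patterns. */
static void self_check(void)
{
    assert(bitsel_identities_hold(0xF0F0F0F0F0F0F0F0ull, 0x123456789ABCDEF0ull));
    assert(bitsel_identities_hold(0, UINT64_MAX));
    assert(bitsel_identities_hold(UINT64_MAX, 0x00FF00FF00FF00FFull));
}
```

Each line of bitsel_identities_hold corresponds to one "simpler logical" that a constant 0 or -1 operand makes possible; the andc and orc forms are only useful when the target provides those operations.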
Signed-off-by: Richard Henderson
---
 tcg/optimize.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index f11f576fd4..e9ef16b3c6 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -2737,6 +2737,61 @@ static bool fold_xor(OptContext *ctx, TCGOp *op)
     return fold_masks(ctx, op);
 }
 
+static bool fold_bitsel_vec(OptContext *ctx, TCGOp *op)
+{
+    /* If true and false values are the same, eliminate the cmp. */
+    if (args_are_copies(op->args[2], op->args[3])) {
+        return tcg_opt_gen_mov(ctx, op, op->args[0], op->args[2]);
+    }
+
+    if (arg_is_const(op->args[2]) && arg_is_const(op->args[3])) {
+        uint64_t tv = arg_info(op->args[2])->val;
+        uint64_t fv = arg_info(op->args[3])->val;
+
+        if (tv == -1 && fv == 0) {
+            return tcg_opt_gen_mov(ctx, op, op->args[0], op->args[1]);
+        }
+        if (tv == 0 && fv == -1) {
+            if (TCG_TARGET_HAS_not_vec) {
+                op->opc = INDEX_op_not_vec;
+                return fold_not(ctx, op);
+            } else {
+                op->opc = INDEX_op_xor_vec;
+                op->args[2] = arg_new_constant(ctx, -1);
+                return fold_xor(ctx, op);
+            }
+        }
+    }
+    if (arg_is_const(op->args[2])) {
+        uint64_t tv = arg_info(op->args[2])->val;
+        if (tv == -1) {
+            op->opc = INDEX_op_or_vec;
+            op->args[2] = op->args[3];
+            return fold_or(ctx, op);
+        }
+        if (tv == 0 && TCG_TARGET_HAS_andc_vec) {
+            op->opc = INDEX_op_andc_vec;
+            op->args[2] = op->args[1];
+            op->args[1] = op->args[3];
+            return fold_andc(ctx, op);
+        }
+    }
+    if (arg_is_const(op->args[3])) {
+        uint64_t fv = arg_info(op->args[3])->val;
+        if (fv == 0) {
+            op->opc = INDEX_op_and_vec;
+            return fold_and(ctx, op);
+        }
+        if (fv == -1 && TCG_TARGET_HAS_orc_vec) {
+            op->opc = INDEX_op_orc_vec;
+            op->args[2] = op->args[1];
+            op->args[1] = op->args[3];
+            return fold_orc(ctx, op);
+        }
+    }
+    return false;
+}
+
 /* Propagate constants and copies, fold constant expressions. */
 void tcg_optimize(TCGContext *s)
 {
@@ -2964,6 +3019,9 @@ void tcg_optimize(TCGContext *s)
         case INDEX_op_cmpsel_vec:
             done = fold_cmpsel_vec(&ctx, op);
             break;
+        case INDEX_op_bitsel_vec:
+            done = fold_bitsel_vec(&ctx, op);
+            break;
         CASE_OP_32_64(sextract):
             done = fold_sextract(&ctx, op);
             break;

From patchwork Sun Sep 8 02:26:29 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13795310
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A5F45ECE577 for ; Sun, 8 Sep 2024 02:28:15 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sn7dZ-0003aP-7g; Sat, 07 Sep 2024 22:26:49 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sn7dX-0003R4-1b for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:47 -0400
Received: from mail-pj1-x1030.google.com ([2607:f8b0:4864:20::1030]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sn7dV-00052X-Fw for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:46 -0400
Received: by mail-pj1-x1030.google.com with SMTP id 98e67ed59e1d1-2d8a7c50607so2195143a91.1 for ; Sat, 07 Sep 2024 19:26:45 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725762404; x=1726367204; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=IPQ3Iuu/LAmgLlUZXWZGZIzBg2biU/xcO4whzf7Rmmg=;
b=lHxtc0Lhyjea3wX5WTWZFzwz+eUo4PMaZSxZIBhmCAZOdxLCogbeHgRfVNPClF+nDF utiYWcC58SzAyroMVr8Xn5xmNwlf84HLt/5oFH0ck6wfEP3UVNeq7F3pLOPzGfbWJvgE 5e0DNTI6DdNZKo1bOLPTXzRzEhQqLgfAipiuOU9BIv++Q7XKoqhIZpTnSSow46PwibJ2 LSQGhl1ToDXJdv+YUAKRRzHjzc7iofw5hnq/iTLNSrqoKiz5JfbKNWQT8mE0dbJPAdeQ P94d7OatxumK/xQ4o0SZpfrqQEjVJuzM6i7eLzDQNnwKvj8UbB4vg7matbryuYb1KLLb knBQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725762404; x=1726367204; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=IPQ3Iuu/LAmgLlUZXWZGZIzBg2biU/xcO4whzf7Rmmg=; b=bMZsYFcAboAt+mFDWNq8J3Uk34kOxhG3tnLxpoteYd12A01kUrM/b9vO1opYSyDyc9 e/ifdU+R2ztvFjsbgaybsGrITdyrZRNayblDZtDPqlHOZciG1PPw+Y24CPEioNlirqiA JjYr+CZblbG0O4bNRmudvtUA2Z0PSg6aHHxZnmalv9eT8nOR6xWgJVAJvpIZuKl2+IQ1 1C6drBEQfjEzc2qDOpHaWSAWxLT3G9caaXJ3vKezax4d1qHZSAL8cM6UYmQ3uS95DK94 6ZVc6ojzvMm87QInL0SUs1LHkALgPuTN0SHp7ZJdnr1UxSpIYYiQ4oizGmUpuSki3iNd ofAw== X-Gm-Message-State: AOJu0YzhGYmXvyWjf3w2WTyjXs06QpiEC8CG1w4/cdn/oEG/GUfZyZ7v 9Sf1z/vSVAIq2A+GcefVqApE6b+5eBTaJPfkIhDblEtT1aZqw2v0vPnyvblTyIQIW7zFbiztbyK g X-Google-Smtp-Source: AGHT+IGBMUF8+1spcbwix1npOokfkEHzRs+BF18asiWY4gbuwbckJJbZT4hfWLtTMgXNGVIp2cozgw== X-Received: by 2002:a17:90b:4a0b:b0:2d3:d414:4511 with SMTP id 98e67ed59e1d1-2dad5022bc3mr9002725a91.24.1725762404137; Sat, 07 Sep 2024 19:26:44 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. 
[174.21.81.121]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2dadbfe46d4sm4084019a91.1.2024.09.07.19.26.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2024 19:26:43 -0700 (PDT)
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, liwei1518@gmail.com, bmeng.cn@gmail.com
Subject: [PATCH 09/12] tcg/i386: Optimize cmpsel with constant 0 arguments
Date: Sat, 7 Sep 2024 19:26:29 -0700
Message-ID: <20240908022632.459477-10-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240908022632.459477-1-richard.henderson@linaro.org>
References: <20240908022632.459477-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Received-SPF: pass client-ip=2607:f8b0:4864:20::1030; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1030.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

These can be simplified to and/andc, avoiding the load of the zero into a register.

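To illustrate the simplification on a single 64-bit lane: a vector compare yields a mask that is all ones or all zeros per lane, so a select against a zero operand collapses to one AND or one AND-NOT of that mask. The C sketch below is a scalar model written for this note; the helper names lt_mask, blend, sel_false_zero, sel_true_zero and self_check are hypothetical, and blend stands in for the general blend on a full-lane mask.

```c
#include <assert.h>
#include <stdint.h>

/* One 64-bit lane of a compare result: all ones if a < b, else zero. */
static uint64_t lt_mask(int64_t a, int64_t b)
{
    return a < b ? UINT64_MAX : 0;
}

/* General select on a full-lane mask, standing in for the blend. */
static uint64_t blend(uint64_t mask, uint64_t tv, uint64_t fv)
{
    return (mask & tv) | (~mask & fv);
}

/* False value is zero: a single AND with the mask. */
static uint64_t sel_false_zero(uint64_t mask, uint64_t tv)
{
    return mask & tv;
}

/* True value is zero: a single AND-NOT, computed as ~mask & fv. */
static uint64_t sel_true_zero(uint64_t mask, uint64_t fv)
{
    return ~mask & fv;
}

/* Both shortcuts must match the general blend for either mask value. */
static void self_check(void)
{
    const uint64_t masks[] = { lt_mask(1, 2), lt_mask(2, 1) };

    for (int i = 0; i < 2; i++) {
        assert(sel_false_zero(masks[i], 42) == blend(masks[i], 42, 0));
        assert(sel_true_zero(masks[i], 42) == blend(masks[i], 0, 42));
    }
}
```

In both shortcut forms the zero never has to exist in a register, which is the saving the commit message describes.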
Signed-off-by: Richard Henderson
---
 tcg/i386/tcg-target-con-set.h | 2 +-
 tcg/i386/tcg-target-con-str.h | 1 +
 tcg/i386/tcg-target.c.inc | 26 +++++++++++++++++++++++---
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/tcg/i386/tcg-target-con-set.h b/tcg/i386/tcg-target-con-set.h
index da4411d96b..a9ff245c42 100644
--- a/tcg/i386/tcg-target-con-set.h
+++ b/tcg/i386/tcg-target-con-set.h
@@ -50,7 +50,7 @@ C_N1_I2(r, r, r)
 C_N1_I2(r, r, rW)
 C_O1_I3(x, 0, x, x)
 C_O1_I3(x, x, x, x)
-C_O1_I4(x, x, x, x, x)
+C_O1_I4(x, x, x, xO, xO)
 C_O1_I4(r, r, reT, r, 0)
 C_O1_I4(r, r, r, ri, ri)
 C_O2_I1(r, r, L)
diff --git a/tcg/i386/tcg-target-con-str.h b/tcg/i386/tcg-target-con-str.h
index cc22db227b..52142ab121 100644
--- a/tcg/i386/tcg-target-con-str.h
+++ b/tcg/i386/tcg-target-con-str.h
@@ -28,6 +28,7 @@ REGS('s', ALL_BYTEL_REGS & ~SOFTMMU_RESERVE_REGS) /* qemu_st8_i32 data */
  */
 CONST('e', TCG_CT_CONST_S32)
 CONST('I', TCG_CT_CONST_I32)
+CONST('O', TCG_CT_CONST_ZERO)
 CONST('T', TCG_CT_CONST_TST)
 CONST('W', TCG_CT_CONST_WSZ)
 CONST('Z', TCG_CT_CONST_U32)
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index a04dc7d270..c63c3faed8 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -133,6 +133,7 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
 #define TCG_CT_CONST_I32 0x400
 #define TCG_CT_CONST_WSZ 0x800
 #define TCG_CT_CONST_TST 0x1000
+#define TCG_CT_CONST_ZERO 0x2000
 
 /* Registers used with L constraint, which are the first argument
    registers on x86_64, and two random call clobbered registers on
@@ -226,6 +227,9 @@ static bool tcg_target_const_match(int64_t val, int ct,
     if ((ct & TCG_CT_CONST_WSZ) && val == (type == TCG_TYPE_I32 ? 32 : 64)) {
         return 1;
     }
+    if ((ct & TCG_CT_CONST_ZERO) && val == 0) {
+        return 1;
+    }
     return 0;
 }
 
@@ -3119,13 +3123,29 @@ static void tcg_out_cmpsel_vec(TCGContext *s, TCGType type, unsigned vece,
                                TCGReg v0, TCGReg c1, TCGReg c2,
                                TCGReg v3, TCGReg v4, TCGCond cond)
 {
+    /*
+     * Since XMM0 is 16, the only way we get 0 into V3 and V4
+     * is via the constant zero constraint.
+     */
+    if (!v3 && !v4) {
+        tcg_out_dupi_vec(s, type, vece, v0, 0);
+        return;
+    }
+
     if (tcg_out_cmp_vec_noinv(s, type, vece, TCG_TMP_VEC, c1, c2, cond)) {
         TCGReg swap = v3;
         v3 = v4;
         v4 = swap;
     }
-    tcg_out_vex_modrm_type(s, OPC_VPBLENDVB, v0, v4, v3, type);
-    tcg_out8(s, (TCG_TMP_VEC - TCG_REG_XMM0) << 4);
+
+    if (!v3) {
+        tcg_out_vex_modrm_type(s, OPC_PANDN, v0, TCG_TMP_VEC, v4, type);
+    } else if (!v4) {
+        tcg_out_vex_modrm_type(s, OPC_PAND, v0, TCG_TMP_VEC, v3, type);
+    } else {
+        tcg_out_vex_modrm_type(s, OPC_VPBLENDVB, v0, v4, v3, type);
+        tcg_out8(s, (TCG_TMP_VEC - TCG_REG_XMM0) << 4);
+    }
 }
 
 static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
@@ -3716,7 +3736,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_bitsel_vec:
         return C_O1_I3(x, x, x, x);
     case INDEX_op_cmpsel_vec:
-        return C_O1_I4(x, x, x, x, x);
+        return C_O1_I4(x, x, x, xO, xO);
 
     default:
         g_assert_not_reached();

From patchwork Sun Sep 8 02:26:30 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13795312
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BC19AECE57B for ; Sun, 8 Sep 2024 02:28:24 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id
1sn7da-0003fc-Gm; Sat, 07 Sep 2024 22:26:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sn7dY-0003ZD-VL for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:48 -0400 Received: from mail-pj1-x1035.google.com ([2607:f8b0:4864:20::1035]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sn7dX-000533-6I for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:48 -0400 Received: by mail-pj1-x1035.google.com with SMTP id 98e67ed59e1d1-2d87f34a650so2200908a91.1 for ; Sat, 07 Sep 2024 19:26:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725762405; x=1726367205; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=pIGeU99kdWDiGBR1KxE6USqh+y8O9jN60zcDlRwe3e8=; b=oZt8LoTxyxCms6BNI1DsNTDdUz1tE2HztUlbF7ZAhU0QutJJi/+xlFFY6qv7CoQelZ +GFzcv4zLDG3Y9BZFdfrvDaiUpsf3fnV6/UStHGeGPxaSbPp5Eg4DgvOgfCIVMzvq/Jr FsykJlNBPorJTEIDObuDxERD6WtLJ0mDtdpcfPBHePO0SOpAWw2R7SJ+r3mPQlx1Tyrx 48c+fWbXQY9hnigoEuMSXSKAEdNTvBInHIR+7WdMrPcKWB/zWlkzeApQoEGCcim8ZAL3 14qTyd+D4/4yAV+QWNwDJfnsuDe0B+FgzzSd/0Cq/9kFOUHOBT3tdEYIAZ5WnoF0k4Jf Oudw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725762405; x=1726367205; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pIGeU99kdWDiGBR1KxE6USqh+y8O9jN60zcDlRwe3e8=; b=L7E6cMmHa+IV+yWaC+6Ol0JSQa7ZVKeHDjS8pdS6oQIgC9SHRtuA0eQ5Ybfj0Edlut 9chbVM11kkRbrsFNKX80tZVwazTNTWh+/DBWVrBaqrutEyUz6jhLZIUqOl4Q0+DvlfGQ MHknS3of7b3Pdh7iBMfmq/0446rE2EYnx5v5MJNS4vIjdJmhkcJHEdq0j7kYbxRvia3f dNr/qH1a6rFX5MNqK3F8p7m60F7sJ8xSYqI/jO6nDa312Lp8AbV5AZWAcufJN/KP9Lh1 
MDYr4WfA8CtqgZeqKR4ci8Lh9RwKmvan3kNAcUiXShzMGKHxL5BiDDYVPAhIrYEjk5A6 +kzQ== X-Gm-Message-State: AOJu0Yxv9Fod/b6xpeMZ2Icuo6f7BskXKm8ZUtbeesF71bn+2aQ3dlNl HrMF0sVMtL/Up/hDEHhJhxYS1nazZWMP95b8ME2e6JTxh02Jj+jpwSNef75NBW+GMgL8/dlzYZb m X-Google-Smtp-Source: AGHT+IEvOOFA+fL2q+Z/JXJP/ZVVW0sv1v+INMjkuvbwp/v7Cpjkv+Dp+uFPszi+KNqCmqkqwbkmTQ== X-Received: by 2002:a17:90a:c296:b0:2da:88b3:cff8 with SMTP id 98e67ed59e1d1-2dad50281b4mr11244110a91.6.1725762405078; Sat, 07 Sep 2024 19:26:45 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2dadbfe46d4sm4084019a91.1.2024.09.07.19.26.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2024 19:26:44 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, liwei1518@gmail.com, bmeng.cn@gmail.com Subject: [PATCH 10/12] tcg/i386: Implement cmp_vec with avx512 insns Date: Sat, 7 Sep 2024 19:26:30 -0700 Message-ID: <20240908022632.459477-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240908022632.459477-1-richard.henderson@linaro.org> References: <20240908022632.459477-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1035; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1035.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: 
qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org The sse/avx instruction set only has EQ and GT as direct comparisons. Other signed comparisons can be generated from swapping and inversion. However unsigned comparisons are not available and must be transformed to signed comparisons by biasing the inputs. The avx512 instruction set has a complete set of comparisons, with results placed into a predicate register. We can produce the normal cmp_vec result by using VPMOVM2*. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 64 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 63 insertions(+), 1 deletion(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index c63c3faed8..839384885b 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -413,6 +413,14 @@ static bool tcg_target_const_match(int64_t val, int ct, #define OPC_UD2 (0x0b | P_EXT) #define OPC_VPBLENDD (0x02 | P_EXT3A | P_DATA16) #define OPC_VPBLENDVB (0x4c | P_EXT3A | P_DATA16) +#define OPC_VPCMPB (0x3f | P_EXT3A | P_DATA16 | P_EVEX) +#define OPC_VPCMPUB (0x3e | P_EXT3A | P_DATA16 | P_EVEX) +#define OPC_VPCMPW (0x3f | P_EXT3A | P_DATA16 | P_VEXW | P_EVEX) +#define OPC_VPCMPUW (0x3e | P_EXT3A | P_DATA16 | P_VEXW | P_EVEX) +#define OPC_VPCMPD (0x1f | P_EXT3A | P_DATA16 | P_EVEX) +#define OPC_VPCMPUD (0x1e | P_EXT3A | P_DATA16 | P_EVEX) +#define OPC_VPCMPQ (0x1f | P_EXT3A | P_DATA16 | P_VEXW | P_EVEX) +#define OPC_VPCMPUQ (0x1e | P_EXT3A | P_DATA16 | P_VEXW | P_EVEX) #define OPC_VPINSRB (0x20 | P_EXT3A | P_DATA16) #define OPC_VPINSRW (0xc4 | P_EXT | P_DATA16) #define OPC_VBROADCASTSS (0x18 | P_EXT38 | P_DATA16) @@ -421,6 +429,10 @@ static bool tcg_target_const_match(int64_t val, int ct, #define OPC_VPBROADCASTW (0x79 | P_EXT38 | P_DATA16) #define OPC_VPBROADCASTD (0x58 | P_EXT38 | P_DATA16) #define OPC_VPBROADCASTQ (0x59 | P_EXT38 | P_DATA16) +#define OPC_VPMOVM2B (0x28 | P_EXT38 | P_SIMDF3 | P_EVEX) +#define OPC_VPMOVM2W (0x28 | P_EXT38 | 
P_SIMDF3 | P_VEXW | P_EVEX) +#define OPC_VPMOVM2D (0x38 | P_EXT38 | P_SIMDF3 | P_EVEX) +#define OPC_VPMOVM2Q (0x38 | P_EXT38 | P_SIMDF3 | P_VEXW | P_EVEX) #define OPC_VPERMQ (0x00 | P_EXT3A | P_DATA16 | P_VEXW) #define OPC_VPERM2I128 (0x46 | P_EXT3A | P_DATA16 | P_VEXL) #define OPC_VPROLVD (0x15 | P_EXT38 | P_DATA16 | P_EVEX) @@ -3110,9 +3122,59 @@ static bool tcg_out_cmp_vec_noinv(TCGContext *s, TCGType type, unsigned vece, return fixup & NEED_INV; } +static void tcg_out_cmp_vec_k1(TCGContext *s, TCGType type, unsigned vece, + TCGReg v1, TCGReg v2, TCGCond cond) +{ + static const int cmpm_insn[2][4] = { + { OPC_VPCMPB, OPC_VPCMPW, OPC_VPCMPD, OPC_VPCMPQ }, + { OPC_VPCMPUB, OPC_VPCMPUW, OPC_VPCMPUD, OPC_VPCMPUQ } + }; + static const int cond_ext[16] = { + [TCG_COND_EQ] = 0, + [TCG_COND_NE] = 4, + [TCG_COND_LT] = 1, + [TCG_COND_LTU] = 1, + [TCG_COND_LE] = 2, + [TCG_COND_LEU] = 2, + [TCG_COND_NEVER] = 3, + [TCG_COND_GE] = 5, + [TCG_COND_GEU] = 5, + [TCG_COND_GT] = 6, + [TCG_COND_GTU] = 6, + [TCG_COND_ALWAYS] = 7, + }; + + tcg_out_vex_modrm_type(s, cmpm_insn[is_unsigned_cond(cond)][vece], + /* k1 */ 1, v1, v2, type); + tcg_out8(s, cond_ext[cond]); +} + +static void tcg_out_k1_to_vec(TCGContext *s, TCGType type, + unsigned vece, TCGReg dest) +{ + static const int movm_insn[] = { + OPC_VPMOVM2B, OPC_VPMOVM2W, OPC_VPMOVM2D, OPC_VPMOVM2Q + }; + tcg_out_vex_modrm_type(s, movm_insn[vece], dest, 0, /* k1 */ 1, type); +} + static void tcg_out_cmp_vec(TCGContext *s, TCGType type, unsigned vece, TCGReg v0, TCGReg v1, TCGReg v2, TCGCond cond) { + /* + * With avx512, we have a complete set of comparisons into mask. + * Unless there's a single insn expansion for the comparison, + * expand via a mask in k1. + */ + if ((vece <= MO_16 ?
have_avx512bw : have_avx512dq) + && cond != TCG_COND_EQ + && cond != TCG_COND_LT + && cond != TCG_COND_GT) { + tcg_out_cmp_vec_k1(s, type, vece, v1, v2, cond); + tcg_out_k1_to_vec(s, type, vece, v0); + return; + } + if (tcg_out_cmp_vec_noinv(s, type, vece, v0, v1, v2, cond)) { tcg_out_dupi_vec(s, type, vece, TCG_TMP_VEC, -1); tcg_out_vex_modrm_type(s, OPC_PXOR, v0, v0, TCG_TMP_VEC, type); @@ -4080,7 +4142,7 @@ static TCGCond expand_vec_cond(TCGType type, unsigned vece, * We must bias the inputs so that they become signed. * All other swapping and inversion are handled during code generation. */ - if (vece == MO_64 && is_unsigned_cond(cond)) { + if (vece == MO_64 && !have_avx512dq && is_unsigned_cond(cond)) { TCGv_vec v1 = temp_tcgv_vec(arg_temp(*a1)); TCGv_vec v2 = temp_tcgv_vec(arg_temp(*a2)); TCGv_vec t1 = tcg_temp_new_vec(type); From patchwork Sun Sep 8 02:26:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13795311 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B3CFCECE577 for ; Sun, 8 Sep 2024 02:28:24 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sn7da-0003fI-Dn; Sat, 07 Sep 2024 22:26:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sn7dY-0003Z3-UO for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:48 -0400 Received: from mail-pj1-x102c.google.com ([2607:f8b0:4864:20::102c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 
1sn7dX-00053D-Hh for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:48 -0400 Received: by mail-pj1-x102c.google.com with SMTP id 98e67ed59e1d1-2d88c0f8e79so2548009a91.3 for ; Sat, 07 Sep 2024 19:26:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725762406; x=1726367206; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=kWk3ZOhTNIHcodblpY60Bu/bdWzkAkFjFWEaQMye1xc=; b=y1GqD8ewkU8svjFKjFFXHQvlbwMd5GFSbS/Nmojcj1NEZwBgmNk13DjPTbQx6W3w5t t5rGn4Dt2swSXgIXt1sPr/yY+1+/Xv6L8hb69vmRHNrlyIPnKwWfchJ17Pz09n0QWAjE 2ohByaNpCU3B5nksn4ATSfUjxTFOTtv9Rs/ZoOWxTgQRav77KnC4jeL2TZGcmEHI4CRr 8iSnaIa82hGKIMIaPA/rSA/xnFt19K43gbECQg6wfD1ViFDvBRnqzIR+cQ4ZARHa+SZi 2YV0kG6Sql5ZJqCh4E8KFy5xI2PzeFu7MKQrZvffmPWpebMJQXzHqpOEIgR4z0HGMXC9 9VDQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725762406; x=1726367206; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kWk3ZOhTNIHcodblpY60Bu/bdWzkAkFjFWEaQMye1xc=; b=aHpL2sekS94MEXYLT+i+MWvEDUlGVgRXAvCaLnDCyT9X672WuwugVqAdF7Zkzr2dU4 88NBAOQ5UjRRo6B87VOsMrTpM37fbev7hfvpe4vLgBOZ3WfmMfmCUn/om/0LsqhLw2cN X9BG1R5c8FpcJGbQlHRZqhQUlcIIyl6q+y0Ieqvyb9o8ylSQNmwhoQS/HcMLAlZDcF+h RB5QxGtQuzmUC3QrelZPlGbx/5BMY/4CTvYwmrwsxbV8qV8dXx1ULsJ/nBv+IxuFmgpz flQ/C7KjMLoGVlSjn5AOUi2ESFGaXXw08SZRV6ASpzjA5eMoFHmjJ3sanXgzLXRaKNHv AFtg== X-Gm-Message-State: AOJu0YyOX9s9p5KJtfLyXP18IWGRvJaKvsng2Zn91jM4Eae8mNN1XsqO hHwFW0qHIA+/fMTKHLJoQK/sBv7rgSEunmMgnJ/VAtEmfedJfwI1wxxNYvcFk833GAnXXhFoPuk P X-Google-Smtp-Source: AGHT+IH9Ke0y8ADDVleX4YZLLhJKEG93+ZkxsVhb0yKnkUNeAIv2wkc8bdvSFbDSP9ttTVr9CeL64g== X-Received: by 2002:a17:90a:8ce:b0:2d9:dd4a:6a95 with SMTP id 98e67ed59e1d1-2dad50d14c6mr8341590a91.25.1725762406167; Sat, 07 Sep 2024 19:26:46 -0700 (PDT) Received: from 
stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2dadbfe46d4sm4084019a91.1.2024.09.07.19.26.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2024 19:26:45 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, liwei1518@gmail.com, bmeng.cn@gmail.com Subject: [PATCH 11/12] tcg/i386: Add predicate parameters to tcg_out_evex_opc Date: Sat, 7 Sep 2024 19:26:31 -0700 Message-ID: <20240908022632.459477-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240908022632.459477-1-richard.henderson@linaro.org> References: <20240908022632.459477-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102c; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Extend tcg_out_evex_opc to handle the predicate and zero-merging parameters of the evex prefix. 
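As an illustration only (not part of the patch), the placement of the two new fields can be sketched in standalone C. The sketch assumes the prefix is held as a little-endian 32-bit word, as in tcg_out_evex_opc, so the fourth prefix byte occupies bits 24..31; deposit_bits is a hypothetical stand-in for QEMU's deposit32, and evex_set_predicate is likewise a made-up name.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for deposit32(): replace `len` bits of `word`
 * starting at bit `pos` with the low bits of `val`. Requires len < 32.
 */
static uint32_t deposit_bits(uint32_t word, int pos, int len, uint32_t val)
{
    uint32_t mask = ((UINT32_C(1) << len) - 1) << pos;
    return (word & ~mask) | ((val << pos) & mask);
}

/*
 * Fill in the opmask selector (aaa, bits 24..26) and the zeroing flag
 * (z, bit 31) of a 4-byte EVEX prefix stored as one 32-bit word.
 */
static uint32_t evex_set_predicate(uint32_t prefix, unsigned aaa, bool z)
{
    prefix = deposit_bits(prefix, 24, 3, aaa);
    prefix = deposit_bits(prefix, 31, 1, z);
    return prefix;
}
```

Passing aaa = 0 and z = false leaves the prefix unchanged, which is why existing callers can pass those values and emit the same bytes as before.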
Signed-off-by: Richard Henderson Reviewed-by: Philippe Mathieu-Daudé --- tcg/i386/tcg-target.c.inc | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 839384885b..2a3ae28e85 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -674,7 +674,7 @@ static void tcg_out_vex_opc(TCGContext *s, int opc, int r, int v, } static void tcg_out_evex_opc(TCGContext *s, int opc, int r, int v, - int rm, int index) + int rm, int index, int aaa, bool z) { /* The entire 4-byte evex prefix; with R' and V' set. */ uint32_t p = 0x08041062; @@ -711,7 +711,9 @@ static void tcg_out_evex_opc(TCGContext *s, int opc, int r, int v, p = deposit32(p, 16, 2, pp); p = deposit32(p, 19, 4, ~v); p = deposit32(p, 23, 1, (opc & P_VEXW) != 0); + p = deposit32(p, 24, 3, aaa); p = deposit32(p, 29, 2, (opc & P_VEXL) != 0); + p = deposit32(p, 31, 1, z); tcg_out32(s, p); tcg_out8(s, opc); @@ -720,7 +722,7 @@ static void tcg_out_evex_opc(TCGContext *s, int opc, int r, int v, static void tcg_out_vex_modrm(TCGContext *s, int opc, int r, int v, int rm) { if (opc & P_EVEX) { - tcg_out_evex_opc(s, opc, r, v, rm, 0); + tcg_out_evex_opc(s, opc, r, v, rm, 0, 0, false); } else { tcg_out_vex_opc(s, opc, r, v, rm, 0); } From patchwork Sun Sep 8 02:26:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13795305 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 754D8ECE577 for ; Sun, 8 Sep 2024 02:27:53 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 
1sn7db-0003l5-Qe; Sat, 07 Sep 2024 22:26:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sn7da-0003fa-F9 for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:50 -0400 Received: from mail-pj1-x1036.google.com ([2607:f8b0:4864:20::1036]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sn7dY-00054Q-K4 for qemu-devel@nongnu.org; Sat, 07 Sep 2024 22:26:50 -0400 Received: by mail-pj1-x1036.google.com with SMTP id 98e67ed59e1d1-2d89dbb60bdso2270112a91.1 for ; Sat, 07 Sep 2024 19:26:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1725762407; x=1726367207; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=bQ8ocbxlttrmmofg9gOzgPVyAWJvSO1BPV+Tn9L8tD4=; b=vUDhhs0gye/MX6X7LH/h5TNY6rXFHa4d8dpe9vIO4YgY/ZCsG/1DVMykHsKTnFhcnp zws0uAZsiBlV9lGsjRCzUqTsKIU1b1MZl7KNC5gdsHAJt9/ILCFbydU8RCFwducNIvZ3 hrWHsO2g5fwECHkF4lble+dwISB0L0hNfexMzBwRSd6xg5dLjob7WQ270k7r82hrf+Wp BbUrha+/8g8nLWMd5a2b220qI3pB9fLt6ZN8HQrfFzAa75uJyBjF/nXjnt/AkLGieZy8 y9Vc/np4M13Ud+1uATEsoVrmJp5Y7Gqgx4ebR2A1L8tUBvfr012IqvjHvZ8x3g7sttd2 4DfQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725762407; x=1726367207; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=bQ8ocbxlttrmmofg9gOzgPVyAWJvSO1BPV+Tn9L8tD4=; b=Lkw+nemTNMZX6DUiR4jwVowBNcU+qOl4/SoRqzojcY9wjo5iFC5Tov5QU2lL2zETZl Mp1dwgPV+v4wP3fhYRLUiZHc50Vlo7vMkRWfztlN6yItm+vxzR2+sTknQKQKXeTX15lL NDkZbQpquxiL0Y+4qYw4SAI6uAvZ4OcBIqViHdjQz6GgsPo6LpW2yo5ItIMVmeQ7as79 duqyj8D8FtryOkPswG4qQEBfyupklml3+k44o62Suc1Af95/3NqQFmMEqXXAptfzF5vq 
KKMnh6v4ppTDSnAZrXaDU360+RyxSIgP/UnFoqtLf2W+O3EivLT+9NmQPAQYuoqHwkTF l+xg== X-Gm-Message-State: AOJu0Yy7T17yMQLHCnCfJv8fBJf3Xto5ZUpCDKE6W6ac0fkV8hPOxQEm 47HwCwXFtvnjIMhq5+AKJbXrXYO5I5Ru5tWIp3O2fqz5ajyarX5PX7SAC1QLG1MLDookv+KbcDj J X-Google-Smtp-Source: AGHT+IHt9lbe1ukWLjXQsrMwgsj9jxgvFcW1h4I0syWKgSSgt1ZokKs/2cSBLcvhg7mrp1g0fbdiiA== X-Received: by 2002:a17:90b:4c41:b0:2d8:f11e:f7e with SMTP id 98e67ed59e1d1-2daffa7d5b8mr4432655a91.12.1725762407046; Sat, 07 Sep 2024 19:26:47 -0700 (PDT) Received: from stoup.. (174-21-81-121.tukw.qwest.net. [174.21.81.121]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2dadbfe46d4sm4084019a91.1.2024.09.07.19.26.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2024 19:26:46 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: zhiwei_liu@linux.alibaba.com, tangtiancheng.ttc@alibaba-inc.com, liwei1518@gmail.com, bmeng.cn@gmail.com Subject: [PATCH 12/12] tcg/i386: Implement cmpsel_vec with avx512 insns Date: Sat, 7 Sep 2024 19:26:32 -0700 Message-ID: <20240908022632.459477-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240908022632.459477-1-richard.henderson@linaro.org> References: <20240908022632.459477-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1036; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1036.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: 
qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org The avx512 vpblendm* instructions exactly implement cmpsel, using a predicate input. Of course this matches nicely with the avx512 predicate comparison instructions. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 46 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 2a3ae28e85..8c363b7bfc 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -413,6 +413,10 @@ static bool tcg_target_const_match(int64_t val, int ct, #define OPC_UD2 (0x0b | P_EXT) #define OPC_VPBLENDD (0x02 | P_EXT3A | P_DATA16) #define OPC_VPBLENDVB (0x4c | P_EXT3A | P_DATA16) +#define OPC_VPBLENDMB (0x66 | P_EXT38 | P_DATA16 | P_EVEX) +#define OPC_VPBLENDMW (0x66 | P_EXT38 | P_DATA16 | P_VEXW | P_EVEX) +#define OPC_VPBLENDMD (0x64 | P_EXT38 | P_DATA16 | P_EVEX) +#define OPC_VPBLENDMQ (0x64 | P_EXT38 | P_DATA16 | P_VEXW | P_EVEX) #define OPC_VPCMPB (0x3f | P_EXT3A | P_DATA16 | P_EVEX) #define OPC_VPCMPUB (0x3e | P_EXT3A | P_DATA16 | P_EVEX) #define OPC_VPCMPW (0x3f | P_EXT3A | P_DATA16 | P_VEXW | P_EVEX) @@ -738,6 +742,16 @@ static void tcg_out_vex_modrm_type(TCGContext *s, int opc, tcg_out_vex_modrm(s, opc, r, v, rm); } +static void tcg_out_evex_modrm_type(TCGContext *s, int opc, int r, int v, + int rm, int aaa, bool z, TCGType type) +{ + if (type == TCG_TYPE_V256) { + opc |= P_VEXL; + } + tcg_out_evex_opc(s, opc, r, v, rm, 0, aaa, z); + tcg_out8(s, 0xc0 | (LOWREGMASK(r) << 3) | LOWREGMASK(rm)); +} + /* Output an opcode with a full "rm + (index<