From patchwork Thu Feb 13 07:52:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Robert Hoo X-Patchwork-Id: 11379973 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B0379921 for ; Thu, 13 Feb 2020 07:54:44 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 918792168B for ; Thu, 13 Feb 2020 07:54:44 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 918792168B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:48508 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j29Kd-0004iw-Ql for patchwork-qemu-devel@patchwork.kernel.org; Thu, 13 Feb 2020 02:54:43 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:58482) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j29J8-0002WS-Ig for qemu-devel@nongnu.org; Thu, 13 Feb 2020 02:53:11 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1j29J7-0001vO-C2 for qemu-devel@nongnu.org; Thu, 13 Feb 2020 02:53:10 -0500 Received: from mga17.intel.com ([192.55.52.151]:53030) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1j29J7-0001u3-2i for qemu-devel@nongnu.org; Thu, 13 Feb 2020 02:53:09 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 12 Feb 2020 23:53:05 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,436,1574150400"; d="scan'208";a="381032021" Received: from sqa-gate.sh.intel.com (HELO robert-ivt.tsp.org) ([10.239.48.212]) by orsmga004.jf.intel.com with ESMTP; 12 Feb 2020 23:53:03 -0800 From: Robert Hoo To: qemu-devel@nongnu.org, pbonzini@redhat.com, laurent@vivier.eu, philmd@redhat.com, berrange@redhat.com Subject: [PATCH 1/2] configure: add configure option avx512f_opt Date: Thu, 13 Feb 2020 15:52:58 +0800 Message-Id: <1581580379-54109-2-git-send-email-robert.hu@linux.intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1581580379-54109-1-git-send-email-robert.hu@linux.intel.com> References: <1581580379-54109-1-git-send-email-robert.hu@linux.intel.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 192.55.52.151 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: robert.hu@intel.com, Robert Hoo Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" Like previous avx2_opt option, config-host.mak will have CONFIG_AVX512F_OPT defined if compiling host has the ability. AVX512F instruction set is available since Intel Skylake. More info: https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf Signed-off-by: Robert Hoo --- configure | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/configure b/configure index 115dc38..9bf8de0 100755 --- a/configure +++ b/configure @@ -1382,6 +1382,11 @@ for opt do ;; --enable-avx2) avx2_opt="yes" ;; + --disable-avx512f) avx512f_opt="no" + ;; + --enable-avx512f) avx512f_opt="yes" + ;; + --enable-glusterfs) glusterfs="yes" ;; --disable-virtio-blk-data-plane|--enable-virtio-blk-data-plane) @@ -1811,6 +1816,7 @@ disabled with --disable-FEATURE, default is enabled if available: tcmalloc tcmalloc support jemalloc jemalloc support avx2 AVX2 optimization support + avx512f AVX512F optimization support replication replication support opengl opengl support virglrenderer virgl rendering support @@ -5481,6 +5487,34 @@ EOF fi fi +########################################## +# avx512f optimization requirement check +# +# There is no point enabling this if cpuid.h is not usable, +# since we won't be able to select the new routines. + +if test "$cpuid_h" = "yes" && test "$avx512f_opt" != "no"; then + cat > $TMPC << EOF +#pragma GCC push_options +#pragma GCC target("avx512f") +#include +#include +static int bar(void *a) { + __m512i x = *(__m512i *)a; + return _mm512_test_epi64_mask(x, x); +} +int main(int argc, char *argv[]) +{ + return bar(argv[0]); +} +EOF + if compile_object "" ; then + avx512f_opt="yes" + else + avx512f_opt="no" + fi +fi + ######################################## # check if __[u]int128_t is usable. @@ -6605,6 +6639,7 @@ echo "libxml2 $libxml2" echo "tcmalloc support $tcmalloc" echo "jemalloc support $jemalloc" echo "avx2 optimization $avx2_opt" +echo "avx512f optimization $avx512f_opt" echo "replication support $replication" echo "VxHS block device $vxhs" echo "bochs support $bochs" @@ -7152,6 +7187,10 @@ if test "$avx2_opt" = "yes" ; then echo "CONFIG_AVX2_OPT=y" >> $config_host_mak fi +if test "$avx512f_opt" = "yes" ; then + echo "CONFIG_AVX512F_OPT=y" >> $config_host_mak +fi + if test "$lzo" = "yes" ; then echo "CONFIG_LZO=y" >> $config_host_mak fi From patchwork Thu Feb 13 07:52:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Robert Hoo X-Patchwork-Id: 11379971 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F219E1805 for ; Thu, 13 Feb 2020 07:53:44 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D374E2168B for ; Thu, 13 Feb 2020 07:53:44 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D374E2168B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:48500 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j29Jf-0003QR-Ur for patchwork-qemu-devel@patchwork.kernel.org; Thu, 13 Feb 2020 02:53:43 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:58500) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1j29J9-0002X2-DR for qemu-devel@nongnu.org; Thu, 13 Feb 2020 02:53:12 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1j29J8-0001wA-8K for qemu-devel@nongnu.org; Thu, 13 Feb 2020 02:53:11 -0500 Received: from mga17.intel.com ([192.55.52.151]:53034) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1j29J8-0001uK-0B for qemu-devel@nongnu.org; Thu, 13 Feb 2020 02:53:10 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 12 Feb 2020 23:53:07 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.70,436,1574150400"; d="scan'208";a="381032029" Received: from sqa-gate.sh.intel.com (HELO robert-ivt.tsp.org) ([10.239.48.212]) by orsmga004.jf.intel.com with ESMTP; 12 Feb 2020 23:53:05 -0800 From: Robert Hoo To: qemu-devel@nongnu.org, pbonzini@redhat.com, laurent@vivier.eu, philmd@redhat.com, berrange@redhat.com Subject: [PATCH 2/2] util: add util function buffer_zero_avx512() Date: Thu, 13 Feb 2020 15:52:59 +0800 Message-Id: <1581580379-54109-3-git-send-email-robert.hu@linux.intel.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1581580379-54109-1-git-send-email-robert.hu@linux.intel.com> References: <1581580379-54109-1-git-send-email-robert.hu@linux.intel.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 192.55.52.151 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: robert.hu@intel.com, Robert Hoo Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" And initialize buffer_is_zero() with it, when Intel AVX512F is available on host. This function utilizes Intel AVX512 fundamental instructions which perform over previous AVX2 instructions. Signed-off-by: Robert Hoo --- include/qemu/cpuid.h | 3 +++ util/bufferiszero.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++++--- 2 files changed, 56 insertions(+), 3 deletions(-) diff --git a/include/qemu/cpuid.h b/include/qemu/cpuid.h index 6930170..09fc245 100644 --- a/include/qemu/cpuid.h +++ b/include/qemu/cpuid.h @@ -45,6 +45,9 @@ #ifndef bit_AVX2 #define bit_AVX2 (1 << 5) #endif +#ifndef bit_AVX512F +#define bit_AVX512F (1 << 16) +#endif #ifndef bit_BMI2 #define bit_BMI2 (1 << 8) #endif diff --git a/util/bufferiszero.c b/util/bufferiszero.c index bfb2605..cbb854a 100644 --- a/util/bufferiszero.c +++ b/util/bufferiszero.c @@ -187,12 +187,54 @@ buffer_zero_avx2(const void *buf, size_t len) #pragma GCC pop_options #endif /* CONFIG_AVX2_OPT */ +#ifdef CONFIG_AVX512F_OPT +#pragma GCC push_options +#pragma GCC target("avx512f") +#include + +static bool +buffer_zero_avx512(const void *buf, size_t len) +{ + __m512i t; + __m512i *p, *e; + + if (unlikely(len < 64)) { /*buff less than 512 bits, unlikely*/ + return buffer_zero_int(buf, len); + } + /* Begin with an unaligned head of 64 bytes. */ + t = _mm512_loadu_si512(buf); + p = (__m512i *)(((uintptr_t)buf + 5 * 64) & -64); + e = (__m512i *)(((uintptr_t)buf + len) & -64); + + /* Loop over 64-byte aligned blocks of 256. */ + while (p < e) { + __builtin_prefetch(p); + if (unlikely(_mm512_test_epi64_mask(t, t))) { + return false; + } + t = p[-4] | p[-3] | p[-2] | p[-1]; + p += 4; + } + + t |= _mm512_loadu_si512(buf + len - 4 * 64); + t |= _mm512_loadu_si512(buf + len - 3 * 64); + t |= _mm512_loadu_si512(buf + len - 2 * 64); + t |= _mm512_loadu_si512(buf + len - 1 * 64); + + return !_mm512_test_epi64_mask(t, t); + +} +#pragma GCC pop_options +#endif + + /* Note that for test_buffer_is_zero_next_accel, the most preferred * ISA must have the least significant bit. */ -#define CACHE_AVX2 1 -#define CACHE_SSE4 2 -#define CACHE_SSE2 4 +#define CACHE_AVX512F 1 +#define CACHE_AVX2 2 +#define CACHE_SSE4 4 +#define CACHE_SSE2 6 /* Make sure that these variables are appropriately initialized when * SSE2 is enabled on the compiler command-line, but the compiler is @@ -226,6 +268,11 @@ static void init_accel(unsigned cache) fn = buffer_zero_avx2; } #endif +#ifdef CONFIG_AVX512F_OPT + if (cache & CACHE_AVX512F) { + fn = buffer_zero_avx512; + } +#endif buffer_accel = fn; } @@ -255,6 +302,9 @@ static void __attribute__((constructor)) init_cpuid_cache(void) if ((bv & 6) == 6 && (b & bit_AVX2)) { cache |= CACHE_AVX2; } + if ((bv & 6) == 6 && (b & bit_AVX512F)) { + cache |= CACHE_AVX512F; + } } } cpuid_cache = cache;