From patchwork Sat Nov 24 23:55:52 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 10696627
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Sat, 24 Nov 2018 18:55:52 -0500
Message-Id: <20181124235553.17371-13-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181124235553.17371-1-cota@braap.org>
References: <20181124235553.17371-1-cota@braap.org>
Subject: [Qemu-devel] [PATCH v6 12/13] hardfloat: implement float32/64 square root
Cc: Richard Henderson, Alex Bennée

Performance results for fp-bench:

Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
sqrt-single: 42.30 MFlops
sqrt-double: 22.97 MFlops
- after:
sqrt-single: 311.42 MFlops
sqrt-double: 311.08 MFlops

Here USE_FP makes a huge difference for f64's, with throughput
going from ~200 MFlops to ~300 MFlops.

Signed-off-by: Emilio G. Cota
Reviewed-by: Alex Bennée
---
 fpu/softfloat.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 58 insertions(+), 2 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index e03feafb6f..4c6ecd1883 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3040,20 +3040,76 @@ float16 QEMU_FLATTEN float16_sqrt(float16 a, float_status *status)
     return float16_round_pack_canonical(pr, status);
 }
 
-float32 QEMU_FLATTEN float32_sqrt(float32 a, float_status *status)
+static float32 QEMU_SOFTFLOAT_ATTR
+soft_f32_sqrt(float32 a, float_status *status)
 {
     FloatParts pa = float32_unpack_canonical(a, status);
     FloatParts pr = sqrt_float(pa, status, &float32_params);
     return float32_round_pack_canonical(pr, status);
 }
 
-float64 QEMU_FLATTEN float64_sqrt(float64 a, float_status *status)
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_f64_sqrt(float64 a, float_status *status)
 {
     FloatParts pa = float64_unpack_canonical(a, status);
     FloatParts pr = sqrt_float(pa, status, &float64_params);
     return float64_round_pack_canonical(pr, status);
 }
 
+float32 QEMU_FLATTEN float32_sqrt(float32 xa, float_status *s)
+{
+    union_float32 ua, ur;
+
+    ua.s = xa;
+    if (unlikely(!can_use_fpu(s))) {
+        goto soft;
+    }
+
+    float32_input_flush1(&ua.s, s);
+    if (QEMU_HARDFLOAT_1F32_USE_FP) {
+        if (unlikely(!(fpclassify(ua.h) == FP_NORMAL ||
+                       fpclassify(ua.h) == FP_ZERO) ||
+                     signbit(ua.h))) {
+            goto soft;
+        }
+    } else if (unlikely(!float32_is_zero_or_normal(ua.s) ||
+                        float32_is_neg(ua.s))) {
+        goto soft;
+    }
+    ur.h = sqrtf(ua.h);
+    return ur.s;
+
+ soft:
+    return soft_f32_sqrt(ua.s, s);
+}
+
+float64 QEMU_FLATTEN float64_sqrt(float64 xa, float_status *s)
+{
+    union_float64 ua, ur;
+
+    ua.s = xa;
+    if (unlikely(!can_use_fpu(s))) {
+        goto soft;
+    }
+
+    float64_input_flush1(&ua.s, s);
+    if (QEMU_HARDFLOAT_1F64_USE_FP) {
+        if (unlikely(!(fpclassify(ua.h) == FP_NORMAL ||
+                       fpclassify(ua.h) == FP_ZERO) ||
+                     signbit(ua.h))) {
+            goto soft;
+        }
+    } else if (unlikely(!float64_is_zero_or_normal(ua.s) ||
+                        float64_is_neg(ua.s))) {
+        goto soft;
+    }
+    ur.h = sqrt(ua.h);
+    return ur.s;
+
+ soft:
+    return soft_f64_sqrt(ua.s, s);
+}
+
 /*----------------------------------------------------------------------------
 | The pattern for a default generated NaN.
 *----------------------------------------------------------------------------*/