From: Steven Fuerst
To: dri-devel@lists.freedesktop.org
Cc: Steven Fuerst
Subject: [PATCH v3 2/3] Replace int2float() with an optimized version.
Date: Wed, 15 Aug 2012 15:07:15 -0700
Message-Id: <1345068436-19885-2-git-send-email-svfuerst@gmail.com>
In-Reply-To: <1345068436-19885-1-git-send-email-svfuerst@gmail.com>
References: <1345068436-19885-1-git-send-email-svfuerst@gmail.com>
X-Patchwork-Id: 1329921

We use __fls() to find the most significant bit.  Using that, the loop
can be avoided.  A second trick is to use the behaviour of the rotate
instructions to expand the range of the unsigned int to float
conversion to the full 32 bits in a branchless way.

The routine is now exact up to 2^24.  Above that, we truncate, which is
equivalent to rounding towards zero.

Signed-off-by: Steven Fuerst
---
 drivers/gpu/drm/radeon/r600_blit.c |   51 ++++++++++++++++++++----------------
 1 file changed, 28 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600_blit.c b/drivers/gpu/drm/radeon/r600_blit.c
index ee1b815..7d8ac42 100644
--- a/drivers/gpu/drm/radeon/r600_blit.c
+++ b/drivers/gpu/drm/radeon/r600_blit.c
@@ -489,31 +489,36 @@ set_default_state(drm_radeon_private_t *dev_priv)
 	ADVANCE_RING();
 }
 
-uint32_t int2float(uint32_t input)
+/* 23 bits of float fractional data */
+#define I2F_FRAC_BITS	23
+#define I2F_MASK ((1 << I2F_FRAC_BITS) - 1)
+
+/*
+ * Converts unsigned integer into 32-bit IEEE floating point representation.
+ * Will be exact from 0 to 2^24.  Above that, we round towards zero
+ * as the fractional bits will not fit in a float.  (It would be better to
+ * round towards even as the fpu does, but that is slower.)
+ */
+uint32_t int2float(uint32_t x)
 {
-	u32 result, i, exponent, fraction;
-
-	if ((input & 0x3fff) == 0)
-		result = 0; /* 0 is a special case */
-	else {
-		exponent = 140; /* exponent biased by 127; */
-		fraction = (input & 0x3fff) << 10; /* cheat and only
-						      handle numbers below 2^^15 */
-		for (i = 0; i < 14; i++) {
-			if (fraction & 0x800000)
-				break;
-			else {
-				fraction = fraction << 1; /* keep
-							     shifting left until top bit = 1 */
-				exponent = exponent - 1;
-			}
-		}
-		result = exponent << 23 | (fraction & 0x7fffff); /* mask
-								    off top bit; assumed 1 */
-	}
-	return result;
-}
+	uint32_t msb, exponent, fraction;
+
+	/* Zero is special */
+	if (!x) return 0;
+
+	/* Get location of the most significant bit */
+	msb = __fls(x);
 
+	/*
+	 * Use a rotate instead of a shift because that works both leftwards
+	 * and rightwards due to the mod(32) behaviour.  This means we don't
+	 * need to check to see if we are above 2^24 or not.
+	 */
+	fraction = ror32(x, (msb - I2F_FRAC_BITS) & 0x1f) & I2F_MASK;
+	exponent = (127 + msb) << I2F_FRAC_BITS;
+
+	return fraction + exponent;
+}
 
 static int r600_nomm_get_vb(struct drm_device *dev)
 {
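
For anyone who wants to check the rotate trick outside the kernel, the
following is a minimal user-space sketch of the same arithmetic, not the
patch itself.  The names fls_index(), rotr32(), int2float_sketch() and
float_bits() are hypothetical stand-ins invented for this illustration:
fls_index() approximates __fls() using the GCC/Clang builtin
__builtin_clz() (an assumption about the compiler), and rotr32()
approximates ror32().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define I2F_FRAC_BITS	23
#define I2F_MASK	((1u << I2F_FRAC_BITS) - 1)

/* Stand-in for __fls(): index of the highest set bit, x must be nonzero.
 * Assumes a compiler that provides __builtin_clz() (GCC or Clang). */
static uint32_t fls_index(uint32_t x)
{
	assert(x != 0);
	return 31u - (uint32_t)__builtin_clz(x);
}

/* Stand-in for ror32(): rotate right by s bits, s taken modulo 32. */
static uint32_t rotr32(uint32_t w, uint32_t s)
{
	s &= 31u;
	return (w >> s) | (w << ((32u - s) & 31u));
}

/* Same steps as the int2float() in the patch, using the stand-ins. */
static uint32_t int2float_sketch(uint32_t x)
{
	uint32_t msb, fraction, exponent;

	if (!x)
		return 0;

	msb = fls_index(x);
	/* For msb < 23 the count wraps to 32 - (23 - msb), so the rotate
	 * acts as a left shift; for msb >= 23 it is a plain right shift
	 * and the low bits that wrap to the top are masked away. */
	fraction = rotr32(x, (msb - I2F_FRAC_BITS) & 0x1f) & I2F_MASK;
	exponent = (127 + msb) << I2F_FRAC_BITS;

	return fraction + exponent;
}

/* Bit pattern of a float, for comparison against the FPU's conversion. */
static uint32_t float_bits(float f)
{
	uint32_t u;

	memcpy(&u, &f, sizeof(u));
	return u;
}
```

Comparing int2float_sketch(i) with float_bits((float)i) should agree for
every i up to 2^24.  Above that the two can differ: 2^24 + 3, for
example, is truncated to 16777218.0f here, whereas round-to-nearest-even
would give 16777220.0f.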