From patchwork Sat Aug 11 17:30:19 2012
X-Patchwork-Submitter: Steven Fuerst
X-Patchwork-Id: 1311681
From: Steven Fuerst
To: dri-devel@lists.freedesktop.org
Cc: Steven Fuerst
Subject: [Patch v2 1/4] Replace i2f() in r600_blit.c with an optimized version.
Date: Sat, 11 Aug 2012 10:30:19 -0700
Message-Id: <1344706222-3018-1-git-send-email-svfuerst@gmail.com>
X-Mailer: git-send-email 1.7.10.4
List-Id: Direct Rendering Infrastructure - Development

We use __fls() to find the most significant bit.  Using that, the
loop can be avoided.  A second trick is to use the behaviour of the
rotate instructions to expand the range of the unsigned int to float
conversion to the full 32 bits in a branchless way.

The routine is now exact up to 2^24.  Above that, we truncate which
is equivalent to rounding towards zero.

Signed-off-by: Steven Fuerst
Reviewed-by: Michel Dänzer
---
 drivers/gpu/drm/radeon/r600_blit.c |   50 ++++++++++++++++++++----------------
 1 file changed, 28 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600_blit.c b/drivers/gpu/drm/radeon/r600_blit.c
index 3c031a4..326a8da 100644
--- a/drivers/gpu/drm/radeon/r600_blit.c
+++ b/drivers/gpu/drm/radeon/r600_blit.c
@@ -489,29 +489,35 @@ set_default_state(drm_radeon_private_t *dev_priv)
 	ADVANCE_RING();
 }
 
-static uint32_t i2f(uint32_t input)
+/* 23 bits of float fractional data */
+#define I2F_FRAC_BITS	23
+#define I2F_MASK ((1 << I2F_FRAC_BITS) - 1)
+
+/*
+ * Converts unsigned integer into 32-bit IEEE floating point representation.
+ * Will be exact from 0 to 2^24.  Above that, we round towards zero
+ * as the fractional bits will not fit in a float.  (It would be better to
+ * round towards even as the fpu does, but that is slower.)
+ */
+static uint32_t i2f(uint32_t x)
 {
-	u32 result, i, exponent, fraction;
-
-	if ((input & 0x3fff) == 0)
-		result = 0; /* 0 is a special case */
-	else {
-		exponent = 140; /* exponent biased by 127; */
-		fraction = (input & 0x3fff) << 10; /* cheat and only
-			handle numbers below 2^^15 */
-		for (i = 0; i < 14; i++) {
-			if (fraction & 0x800000)
-				break;
-			else {
-				fraction = fraction << 1; /* keep
-					shifting left until top bit = 1 */
-				exponent = exponent - 1;
-			}
-		}
-		result = exponent << 23 | (fraction & 0x7fffff); /* mask
-			off top bit; assumed 1 */
-	}
-	return result;
+	uint32_t msb, exponent, fraction;
+
+	/* Zero is special */
+	if (!x) return 0;
+
+	/* Get location of the most significant bit */
+	msb = __fls(x);
+
+	/*
+	 * Use a rotate instead of a shift because that works both leftwards
+	 * and rightwards due to the mod(32) behaviour.  This means we don't
+	 * need to check to see if we are above 2^24 or not.
+	 */
+	fraction = ror32(x, (msb - I2F_FRAC_BITS) & 0x1f) & I2F_MASK;
+	exponent = (127 + msb) << I2F_FRAC_BITS;
+
+	return fraction + exponent;
 }
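
For anyone who wants to check the "exact up to 2^24, truncating above" claim outside the kernel, here is a minimal userspace sketch of the new i2f(). It is not part of the patch. fls_index() and rotr32() are hypothetical portable stand-ins for the kernel's __fls() and ror32(), and float_bits() and i2f_mismatches() are helper names made up for this illustration; the latter compares i2f() against the compiler's own unsigned-to-float conversion, assuming a 32-bit IEEE 754 float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 23 bits of float fractional data, as in the patch */
#define I2F_FRAC_BITS 23
#define I2F_MASK ((1u << I2F_FRAC_BITS) - 1)

/* Hypothetical stand-in for __fls(): index of the highest set bit. */
static unsigned int fls_index(uint32_t x)
{
	unsigned int msb = 0;

	assert(x != 0);	/* like __fls(), undefined for zero */
	while (x >>= 1)
		msb++;
	return msb;
}

/* Hypothetical stand-in for ror32(): rotate right by shift mod 32. */
static uint32_t rotr32(uint32_t word, unsigned int shift)
{
	shift &= 31;
	return (word >> shift) | (word << ((32 - shift) & 31));
}

/* Same logic as the patched i2f(), using the stand-ins above. */
static uint32_t i2f(uint32_t x)
{
	uint32_t msb, exponent, fraction;

	if (!x)
		return 0;

	msb = fls_index(x);

	/* msb < 23 wraps to a left rotate, msb >= 23 is a right rotate */
	fraction = rotr32(x, (msb - I2F_FRAC_BITS) & 0x1f) & I2F_MASK;
	exponent = (127 + msb) << I2F_FRAC_BITS;

	return fraction + exponent;
}

/* Bit pattern of a float, assuming a 32-bit IEEE 754 float. */
static uint32_t float_bits(float f)
{
	uint32_t bits;

	memcpy(&bits, &f, sizeof(bits));
	return bits;
}

/* Count inputs in [0, limit] where i2f() differs from (float)x. */
static uint32_t i2f_mismatches(uint32_t limit)
{
	uint32_t x = 0, bad = 0;

	do {
		if (i2f(x) != float_bits((float)x))
			bad++;
	} while (x++ != limit);

	return bad;
}
```

Calling i2f_mismatches(1u << 24) from a small main() should return 0 on such a platform. Above 2^24 the two conversions can legitimately differ, because the compiler normally rounds to nearest even while i2f() truncates: for example, i2f(0xffffffff) gives 0x4f7fffff rather than the bit pattern of 2^32.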