From patchwork Thu Nov 7 15:17:14 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Ville Syrjala X-Patchwork-Id: 11233149 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E74FF112B for ; Thu, 7 Nov 2019 15:17:36 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CFFF321882 for ; Thu, 7 Nov 2019 15:17:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CFFF321882 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=dri-devel-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B0CD56F6D8; Thu, 7 Nov 2019 15:17:34 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 086DE6E4A6; Thu, 7 Nov 2019 15:17:32 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 07 Nov 2019 07:17:32 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.68,278,1569308400"; d="scan'208";a="213039732" Received: from stinkbox.fi.intel.com (HELO stinkbox) ([10.237.72.174]) by fmsmga001.fm.intel.com with SMTP; 07 Nov 2019 07:17:30 -0800 Received: by stinkbox (sSMTP sendmail emulation); Thu, 07 Nov 2019 17:17:29 +0200 From: Ville Syrjala To: intel-gfx@lists.freedesktop.org Subject: [PATCH 01/12] drm: Inline drm_color_lut_extract() Date: Thu, 7 Nov 2019 17:17:14 +0200 Message-Id: <20191107151725.10507-2-ville.syrjala@linux.intel.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20191107151725.10507-1-ville.syrjala@linux.intel.com> References: <20191107151725.10507-1-ville.syrjala@linux.intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Swati Sharma , dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Ville Syrjälä This thing can get called several thousand times per LUT so seems like we want to inline it to: - avoid the function call overhead - allow constant folding A quick synthetic test (w/o any hardware interaction) with a ridiculously large LUT size shows about 50% reduction in runtime on my HSW and BSW boxes. Slightly less with more reasonable LUT size but still easily measurable in tens of microseconds. Signed-off-by: Ville Syrjälä Reviewed-by: Nicholas Kazlauskas --- drivers/gpu/drm/drm_color_mgmt.c | 24 ------------------------ include/drm/drm_color_mgmt.h | 23 ++++++++++++++++++++++- 2 files changed, 22 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/drm_color_mgmt.c b/drivers/gpu/drm/drm_color_mgmt.c index 4ce5c6d8de99..19c5f635992a 100644 --- a/drivers/gpu/drm/drm_color_mgmt.c +++ b/drivers/gpu/drm/drm_color_mgmt.c @@ -108,30 +108,6 @@ * standard enum values supported by the DRM plane. */ -/** - * drm_color_lut_extract - clamp and round LUT entries - * @user_input: input value - * @bit_precision: number of bits the hw LUT supports - * - * Extract a degamma/gamma LUT value provided by user (in the form of - * &drm_color_lut entries) and round it to the precision supported by the - * hardware. - */ -uint32_t drm_color_lut_extract(uint32_t user_input, uint32_t bit_precision) -{ - uint32_t val = user_input; - uint32_t max = 0xffff >> (16 - bit_precision); - - /* Round only if we're not using full precision. */ - if (bit_precision < 16) { - val += 1UL << (16 - bit_precision - 1); - val >>= 16 - bit_precision; - } - - return clamp_val(val, 0, max); -} -EXPORT_SYMBOL(drm_color_lut_extract); - /** * drm_crtc_enable_color_mgmt - enable color management properties * @crtc: DRM CRTC diff --git a/include/drm/drm_color_mgmt.h b/include/drm/drm_color_mgmt.h index d1c662d92ab7..069b21d61871 100644 --- a/include/drm/drm_color_mgmt.h +++ b/include/drm/drm_color_mgmt.h @@ -29,7 +29,28 @@ struct drm_crtc; struct drm_plane; -uint32_t drm_color_lut_extract(uint32_t user_input, uint32_t bit_precision); +/** + * drm_color_lut_extract - clamp and round LUT entries + * @user_input: input value + * @bit_precision: number of bits the hw LUT supports + * + * Extract a degamma/gamma LUT value provided by user (in the form of + * &drm_color_lut entries) and round it to the precision supported by the + * hardware. + */ +static inline u32 drm_color_lut_extract(u32 user_input, int bit_precision) +{ + u32 val = user_input; + u32 max = 0xffff >> (16 - bit_precision); + + /* Round only if we're not using full precision. */ + if (bit_precision < 16) { + val += 1UL << (16 - bit_precision - 1); + val >>= 16 - bit_precision; + } + + return clamp_val(val, 0, max); +} void drm_crtc_enable_color_mgmt(struct drm_crtc *crtc, uint degamma_lut_size,