[1/2] drm/i915: Use mul_u32_u32() more

Message ID: 20190408152702.4153-1-ville.syrjala@linux.intel.com (mailing list archive)
State: New, archived
Series: [1/2] drm/i915: Use mul_u32_u32() more
        [2/2] drm/i915: Simplify some icl pll calculations
Headers:
Return-Path: <intel-gfx-bounces@lists.freedesktop.org>
From: Ville Syrjala <ville.syrjala@linux.intel.com>
To: intel-gfx@lists.freedesktop.org
Date: Mon, 8 Apr 2019 18:27:01 +0300
Message-Id: <20190408152702.4153-1-ville.syrjala@linux.intel.com>
MIME-Version: 1.0
Subject: [Intel-gfx] [PATCH 1/2] drm/i915: Use mul_u32_u32() more
Precedence: list
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>

Commit Message

Ville Syrjälä April 8, 2019, 3:27 p.m. UTC

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

We have a lot of '(u64)foo * bar' everywhere. Replace with
mul_u32_u32() to avoid gcc failing to use a regular 32x32->64
multiply for this.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_fixed.h     |  6 +++---
 drivers/gpu/drm/i915/intel_display.c  | 10 +++++-----
 drivers/gpu/drm/i915/intel_dpll_mgr.c |  4 ++--
 drivers/gpu/drm/i915/intel_pm.c       |  2 +-
 4 files changed, 11 insertions(+), 11 deletions(-)
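
For readers without the kernel tree at hand, the following is a minimal userspace sketch of what the helper computes. The name mul_u32_u32_sketch() is hypothetical and stands in for the kernel's mul_u32_u32(); it is assumed here to behave like a plain cast-and-multiply, with the kernel version existing mainly so that 32-bit builds can emit a single widening multiply.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the kernel's mul_u32_u32().
 * Assumption: the result is the full 64-bit product of two 32-bit
 * operands, i.e. the same value as the open-coded cast form.
 */
static inline uint64_t mul_u32_u32_sketch(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}

/* The open-coded '(u64)foo * bar' pattern that the patch replaces. */
static inline uint64_t open_coded_mul(uint32_t foo, uint32_t bar)
{
	return (uint64_t)foo * bar;
}
```

Both functions return the same value for every pair of 32-bit inputs; the difference the commit message refers to is in the code gcc generates, not in the result.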

Comments

Chris Wilson April 8, 2019, 3:44 p.m. UTC | #1

Quoting Ville Syrjala (2019-04-08 16:27:01)
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> We have a lot of '(u64)foo * bar' everywhere. Replace with
> mul_u32_u32() to avoid gcc failing to use a regular 32x32->64
> multiply for this.
> 
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

As a purely mechanical translation,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

> diff --git a/drivers/gpu/drm/i915/intel_dpll_mgr.c b/drivers/gpu/drm/i915/intel_dpll_mgr.c
> index e01c057ce50b..29edc369920b 100644
> --- a/drivers/gpu/drm/i915/intel_dpll_mgr.c
> +++ b/drivers/gpu/drm/i915/intel_dpll_mgr.c
> @@ -2741,11 +2741,11 @@ static bool icl_calc_mg_pll_state(struct intel_crtc_state *crtc_state)
>         }
>  
>         if (use_ssc) {
> -               tmp = (u64)dco_khz * 47 * 32;
> +               tmp = mul_u32_u32(dco_khz, 47 * 32);
>                 do_div(tmp, refclk_khz * m1div * 10000);
>                 ssc_stepsize = tmp;
>  
> -               tmp = (u64)dco_khz * 1000;
> +               tmp = mul_u32_u32(dco_khz, 1000);

These caught my eye, wondering if the code was better reduced if the
constant was first or itself cast to (u64).
-Chris

Ville Syrjälä April 10, 2019, 6:24 p.m. UTC | #2

On Mon, Apr 08, 2019 at 04:44:04PM +0100, Chris Wilson wrote:
> Quoting Ville Syrjala (2019-04-08 16:27:01)
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > We have a lot of '(u64)foo * bar' everywhere. Replace with
> > mul_u32_u32() to avoid gcc failing to use a regular 32x32->64
> > multiply for this.
> > 
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> As a purely mechanical translation,
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> > diff --git a/drivers/gpu/drm/i915/intel_dpll_mgr.c b/drivers/gpu/drm/i915/intel_dpll_mgr.c
> > index e01c057ce50b..29edc369920b 100644
> > --- a/drivers/gpu/drm/i915/intel_dpll_mgr.c
> > +++ b/drivers/gpu/drm/i915/intel_dpll_mgr.c
> > @@ -2741,11 +2741,11 @@ static bool icl_calc_mg_pll_state(struct intel_crtc_state *crtc_state)
> >         }
> >  
> >         if (use_ssc) {
> > -               tmp = (u64)dco_khz * 47 * 32;
> > +               tmp = mul_u32_u32(dco_khz, 47 * 32);
> >                 do_div(tmp, refclk_khz * m1div * 10000);
> >                 ssc_stepsize = tmp;
> >  
> > -               tmp = (u64)dco_khz * 1000;
> > +               tmp = mul_u32_u32(dco_khz, 1000);
> 
> These caught my eye, wondering if the code was better reduced if the
> constant was first or itself cast to (u64).

Looks like gcc (8.2) handles these two as is actually. Or at least
the generated asm is identical both ways.
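
The value-level equivalence of the two forms in this hunk can be checked with a small userspace sketch. Everything below is an assumption for illustration: the function names, the uint32_t operand types, and the sample inputs are hypothetical, and plain division stands in for do_div() and DIV_ROUND_UP_ULL(). It says nothing about the generated asm, only about the computed values.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's mul_u32_u32(). */
static inline uint64_t mul_u32_u32_sketch(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}

/* Round-up division, standing in for DIV_ROUND_UP_ULL(). */
static inline uint64_t div_round_up_u64(uint64_t n, uint32_t d)
{
	return (n + d - 1) / d;
}

/* Old form: cast first, then multiply by the two constants. */
static inline uint32_t ssc_stepsize_cast(uint32_t dco_khz,
					 uint32_t refclk_khz, uint32_t m1div)
{
	uint64_t tmp = (uint64_t)dco_khz * 47 * 32;

	tmp /= refclk_khz * m1div * 10000;	/* do_div() in the kernel */
	return (uint32_t)tmp;
}

/* New form: 47 * 32 is folded at compile time into one constant operand. */
static inline uint32_t ssc_stepsize_helper(uint32_t dco_khz,
					   uint32_t refclk_khz, uint32_t m1div)
{
	uint64_t tmp = mul_u32_u32_sketch(dco_khz, 47 * 32);

	tmp /= refclk_khz * m1div * 10000;
	return (uint32_t)tmp;
}

/* Step length as computed after the patch. */
static inline uint32_t ssc_steplen_helper(uint32_t dco_khz)
{
	uint64_t tmp = mul_u32_u32_sketch(dco_khz, 1000);

	return (uint32_t)div_round_up_u64(tmp, 32 * 2 * 32);
}
```

With the made-up inputs dco_khz = 8100000, refclk_khz = 38400 and m1div = 2, both step-size variants return the same number, since 47 * 32 is a small compile-time constant in either spelling.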

diff --git a/drivers/gpu/drm/i915/i915_fixed.h b/drivers/gpu/drm/i915/i915_fixed.h
index 591dd89ba7af..6621595fe74c 100644
--- a/drivers/gpu/drm/i915/i915_fixed.h
+++ b/drivers/gpu/drm/i915/i915_fixed.h
@@ -71,7 +71,7 @@  static inline u32 mul_round_up_u32_fixed16(u32 val, uint_fixed_16_16_t mul)
 {
 	u64 tmp;
 
-	tmp = (u64)val * mul.val;
+	tmp = mul_u32_u32(val, mul.val);
 	tmp = DIV_ROUND_UP_ULL(tmp, 1 << 16);
 	WARN_ON(tmp > U32_MAX);
 
@@ -83,7 +83,7 @@  static inline uint_fixed_16_16_t mul_fixed16(uint_fixed_16_16_t val,
 {
 	u64 tmp;
 
-	tmp = (u64)val.val * mul.val;
+	tmp = mul_u32_u32(val.val, mul.val);
 	tmp = tmp >> 16;
 
 	return clamp_u64_to_fixed16(tmp);
@@ -114,7 +114,7 @@  static inline uint_fixed_16_16_t mul_u32_fixed16(u32 val, uint_fixed_16_16_t mul
 {
 	u64 tmp;
 
-	tmp = (u64)val * mul.val;
+	tmp = mul_u32_u32(val, mul.val);
 
 	return clamp_u64_to_fixed16(tmp);
 }
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index cb7f99618f02..f10ea27d1fc8 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -550,7 +550,7 @@  int chv_calc_dpll_params(int refclk, struct dpll *clock)
 	clock->p = clock->p1 * clock->p2;
 	if (WARN_ON(clock->n == 0 || clock->p == 0))
 		return 0;
-	clock->vco = DIV_ROUND_CLOSEST_ULL((u64)refclk * clock->m,
+	clock->vco = DIV_ROUND_CLOSEST_ULL(mul_u32_u32(refclk, clock->m),
 					   clock->n << 22);
 	clock->dot = DIV_ROUND_CLOSEST(clock->vco, clock->p);
 
@@ -935,8 +935,8 @@  chv_find_best_dpll(const struct intel_limit *limit,
 
 			clock.p = clock.p1 * clock.p2;
 
-			m2 = DIV_ROUND_CLOSEST_ULL(((u64)target * clock.p *
-					clock.n) << 22, refclk * clock.m1);
+			m2 = DIV_ROUND_CLOSEST_ULL(mul_u32_u32(target, clock.p * clock.n) << 22,
+						   refclk * clock.m1);
 
 			if (m2 > INT_MAX/clock.m1)
 				continue;
@@ -6871,7 +6871,7 @@  static u32 ilk_pipe_pixel_rate(const struct intel_crtc_state *pipe_config)
 		if (WARN_ON(!pfit_w || !pfit_h))
 			return pixel_rate;
 
-		pixel_rate = div_u64((u64)pixel_rate * pipe_w * pipe_h,
+		pixel_rate = div_u64(mul_u32_u32(pixel_rate, pipe_w * pipe_h),
 				     pfit_w * pfit_h);
 	}
 
@@ -6991,7 +6991,7 @@  static void compute_m_n(unsigned int m, unsigned int n,
 	else
 		*ret_n = min_t(unsigned int, roundup_pow_of_two(n), DATA_LINK_N_MAX);
 
-	*ret_m = div_u64((u64)m * *ret_n, n);
+	*ret_m = div_u64(mul_u32_u32(m, *ret_n), n);
 	intel_reduce_m_n_ratio(ret_m, ret_n);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_dpll_mgr.c b/drivers/gpu/drm/i915/intel_dpll_mgr.c
index e01c057ce50b..29edc369920b 100644
--- a/drivers/gpu/drm/i915/intel_dpll_mgr.c
+++ b/drivers/gpu/drm/i915/intel_dpll_mgr.c
@@ -2741,11 +2741,11 @@  static bool icl_calc_mg_pll_state(struct intel_crtc_state *crtc_state)
 	}
 
 	if (use_ssc) {
-		tmp = (u64)dco_khz * 47 * 32;
+		tmp = mul_u32_u32(dco_khz, 47 * 32);
 		do_div(tmp, refclk_khz * m1div * 10000);
 		ssc_stepsize = tmp;
 
-		tmp = (u64)dco_khz * 1000;
+		tmp = mul_u32_u32(dco_khz, 1000);
 		ssc_steplen = DIV_ROUND_UP_ULL(tmp, 32 * 2 * 32);
 	} else {
 		ssc_stepsize = 0;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index bba477e62a12..31c03673697e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -675,7 +675,7 @@  static unsigned int intel_wm_method1(unsigned int pixel_rate,
 {
 	u64 ret;
 
-	ret = (u64)pixel_rate * cpp * latency;
+	ret = mul_u32_u32(pixel_rate, cpp * latency);
 	ret = DIV_ROUND_UP_ULL(ret, 10000);
 
 	return ret;
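
Several hunks above turn a three-factor product such as '(u64)a * b * c' into 'mul_u32_u32(a, b * c)'. The sketch below, with hypothetical function names and uint32_t operands assumed, shows what that regrouping means at the value level: the inner product is now a 32-bit multiply. The two forms agree whenever 'b * c' fits in 32 bits. Whether any converted call site can exceed that range is not established by this thread, so this is only something to keep in mind when reviewing the operand ranges.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's mul_u32_u32(). */
static inline uint64_t mul_u32_u32_sketch(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}

/* Old form: the cast widens first, so both multiplies are 64-bit. */
static inline uint64_t three_factor_wide(uint32_t a, uint32_t b, uint32_t c)
{
	return (uint64_t)a * b * c;
}

/* New form: b * c is evaluated in 32 bits, then one 32x32->64 multiply. */
static inline uint64_t three_factor_grouped(uint32_t a, uint32_t b, uint32_t c)
{
	return mul_u32_u32_sketch(a, b * c);
}
```

For small operands, for example (1000, 4, 50), both functions return the same value. For operands where 'b * c' wraps, for example b = c = 0x10000, the grouped form sees an inner product of zero while the wide form does not.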
