Message ID | 20190919140620.32407-3-linux@rasmusvillemoes.dk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/5] backlight: pwm_bl: fix cie1913 comments and constant | expand |
On Thu, Sep 19, 2019 at 04:06:18PM +0200, Rasmus Villemoes wrote: > The scheduler uses a (currently private) fixed_power_int() in its load > average computation for computing powers of numbers 0 < x < 1 > expressed as fixed-point numbers, which is also what we want here. But > that requires the scale to be a power-of-2. It feels like there is some rationale missing in the description here. What is the benefit of replacing the explicit int_pow() with the implicit multiplications? Daniel. > > We could (and a following patch will) change to use a power-of-2 scale, > but for a fixed small exponent of 3, there's no advantage in using > repeated squaring. > > Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> > --- > drivers/video/backlight/pwm_bl.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c > index 9252d51f31b9..aee6839e024a 100644 > --- a/drivers/video/backlight/pwm_bl.c > +++ b/drivers/video/backlight/pwm_bl.c > @@ -179,7 +179,8 @@ static u64 cie1931(unsigned int lightness, unsigned int scale) > if (lightness <= (8 * scale)) { > retval = DIV_ROUND_CLOSEST(lightness * 10, 9033); > } else { > - retval = int_pow((lightness + (16 * scale)) / 116, 3); > + retval = (lightness + (16 * scale)) / 116; > + retval *= retval * retval; > retval = DIV_ROUND_CLOSEST_ULL(retval, (scale * scale)); > } > > -- > 2.20.1
On 07/10/2019 17.28, Daniel Thompson wrote: > On Thu, Sep 19, 2019 at 04:06:18PM +0200, Rasmus Villemoes wrote: > > It feels like there is some rationale missing in the description here. > > What is the benefit of replacing the explicit int_pow() with the > implicit multiplications? > > > Daniel. > > >> >> We could (and a following patch will) change to use a power-of-2 scale, >> but for a fixed small exponent of 3, there's no advantage in using >> repeated squaring. ^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Apart from the function call overhead (and resulting register pressure etc.), using int_pow is less efficient (for an exponent of 3, it ends up doing four 64x64 multiplications instead of just two). But feel free to drop it, I'm not going to pursue it further - it just seemed like a sensible thing to do while I was optimizing the code anyway. [At the time I wrote the patch, this was also the only user of int_pow in the tree, so it also allowed removing int_pow altogether.] Rasmus
On Mon, Oct 07, 2019 at 08:43:31PM +0200, Rasmus Villemoes wrote: > On 07/10/2019 17.28, Daniel Thompson wrote: > > On Thu, Sep 19, 2019 at 04:06:18PM +0200, Rasmus Villemoes wrote: > > > > It feels like there is some rationale missing in the description here. > > > > What is the benefit of replacing the explicit int_pow() with the > > implicit multiplications? > > > > > > Daniel. > > > > > >> > >> We could (and a following patch will) change to use a power-of-2 scale, > >> but for a fixed small exponent of 3, there's no advantage in using > >> repeated squaring. > > ^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > Apart from the function call overhead (and resulting register pressure > etc.), using int_pow is less efficient (for an exponent of 3, it ends up > doing four 64x64 multiplications instead of just two). But feel free to > drop it, I'm not going to pursue it further - it just seemed like a > sensible thing to do while I was optimizing the code anyway. > > [At the time I wrote the patch, this was also the only user of int_pow > in the tree, so it also allowed removing int_pow altogether.] To be honest the change is fine but the patch description doesn't make sense if the only current purpose of the patch is as a optimization. Daniel.
On 08/10/2019 11.31, Daniel Thompson wrote: > On Mon, Oct 07, 2019 at 08:43:31PM +0200, Rasmus Villemoes wrote: >> On 07/10/2019 17.28, Daniel Thompson wrote: >>> On Thu, Sep 19, 2019 at 04:06:18PM +0200, Rasmus Villemoes wrote: >>> >>> It feels like there is some rationale missing in the description here. >>> >> >> Apart from the function call overhead (and resulting register pressure >> etc.), using int_pow is less efficient (for an exponent of 3, it ends up >> doing four 64x64 multiplications instead of just two). But feel free to >> drop it, I'm not going to pursue it further - it just seemed like a >> sensible thing to do while I was optimizing the code anyway. >> >> [At the time I wrote the patch, this was also the only user of int_pow >> in the tree, so it also allowed removing int_pow altogether.] > > To be honest the change is fine but the patch description doesn't make > sense if the only current purpose of the patch is as a optimization. Agreed. Do you want me to resend the series with patch 3 updated to read "For a fixed small exponent of 3, it is more efficient to simply use two explicit multiplications rather than calling the int_pow() library function: Aside from the function call overhead, its implementation using repeated squaring means it ends up doing four 64x64 multiplications." (and obviously patch 5 dropped)? Rasmus
On Tue, Oct 08, 2019 at 12:02:07PM +0200, Rasmus Villemoes wrote: > On 08/10/2019 11.31, Daniel Thompson wrote: > > On Mon, Oct 07, 2019 at 08:43:31PM +0200, Rasmus Villemoes wrote: > >> On 07/10/2019 17.28, Daniel Thompson wrote: > >>> On Thu, Sep 19, 2019 at 04:06:18PM +0200, Rasmus Villemoes wrote: > >>> > >>> It feels like there is some rationale missing in the description here. > >>> > >> > >> Apart from the function call overhead (and resulting register pressure > >> etc.), using int_pow is less efficient (for an exponent of 3, it ends up > >> doing four 64x64 multiplications instead of just two). But feel free to > >> drop it, I'm not going to pursue it further - it just seemed like a > >> sensible thing to do while I was optimizing the code anyway. > >> > >> [At the time I wrote the patch, this was also the only user of int_pow > >> in the tree, so it also allowed removing int_pow altogether.] > > > > To be honest the change is fine but the patch description doesn't make > > sense if the only current purpose of the patch is as a optimization. > > Agreed. Do you want me to resend the series with patch 3 updated to read > > "For a fixed small exponent of 3, it is more efficient to simply use two > explicit multiplications rather than calling the int_pow() library > function: Aside from the function call overhead, its implementation > using repeated squaring means it ends up doing four 64x64 multiplications." > > (and obviously patch 5 dropped)? Yes, please. When you resend you can add my R-B: to all patches: Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Daniel. PS Don't mind either way but I wondered the following is clearer than the slightly funky multiply-and-assign expression (which isn't wrong but isn't very common either so my brain won't speed read it): retval = DIV_ROUND_CLOSEST_ULL(retval * retval * retval, scale * scale);
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c index 9252d51f31b9..aee6839e024a 100644 --- a/drivers/video/backlight/pwm_bl.c +++ b/drivers/video/backlight/pwm_bl.c @@ -179,7 +179,8 @@ static u64 cie1931(unsigned int lightness, unsigned int scale) if (lightness <= (8 * scale)) { retval = DIV_ROUND_CLOSEST(lightness * 10, 9033); } else { - retval = int_pow((lightness + (16 * scale)) / 116, 3); + retval = (lightness + (16 * scale)) / 116; + retval *= retval * retval; retval = DIV_ROUND_CLOSEST_ULL(retval, (scale * scale)); }
The scheduler uses a (currently private) fixed_power_int() in its load average computation for computing powers of numbers 0 < x < 1 expressed as fixed-point numbers, which is also what we want here. But that requires the scale to be a power-of-2. We could (and a following patch will) change to use a power-of-2 scale, but for a fixed small exponent of 3, there's no advantage in using repeated squaring. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> --- drivers/video/backlight/pwm_bl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)