
iio: gts-helpers: Round gains and scales

Message ID ZUDN9n8iXoNwzifQ@dc78bmyyyyyyyyyyyyyyt-3.rev.dnainternet.fi (mailing list archive)
State Changes Requested
Series iio: gts-helpers: Round gains and scales

Commit Message

Matti Vaittinen Oct. 31, 2023, 9:50 a.m. UTC
The GTS helpers floor the scale when calculating the available scales.
As a result, the reported available scales are smaller than they should
be whenever the division in the scale computation leaves a remainder
greater than half of the divisor (fractional part of the result > 0.5).

Furthermore, when gains are computed based on scale, the gain resulting
from the scale computation is also floored. As a consequence the
floored scales reported by available scales may not match the gains that
can be set.

The related discussion can be found at:
https://lore.kernel.org/all/84d7c283-e8e5-4c98-835c-fe3f6ff94f4b@gmail.com/

Do rounding when computing scales and gains.

Fixes: 38416c28e168 ("iio: light: Add gain-time-scale helpers")
Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>

---
Subhajit, is there any chance you could test this patch with your driver?
Can you drop the:
	if (val2 % 10)
		val2 += 1;
from scale setting and check whether the written and read scales match?

I did run a few Kunit tests on this change - but I'm still a bit jumpy
on it... Reviewing/testing is highly appreciated!

In case someone is interested in the Kunit tests: they're somewhat
unpolished and crude and can emit noisy debug prints, but they can be
found at:
https://github.com/M-Vaittinen/linux/commits/iio-gts-helpers-test-v6.6

---
 drivers/iio/industrialio-gts-helper.c | 58 +++++++++++++++++++++++----
 1 file changed, 50 insertions(+), 8 deletions(-)


base-commit: ffc253263a1375a65fa6c9f62a893e9767fbebfa

Comments

Jonathan Cameron Nov. 26, 2023, 5:26 p.m. UTC | #1
On Tue, 31 Oct 2023 11:50:46 +0200
Matti Vaittinen <mazziesaccount@gmail.com> wrote:

> The GTS helpers do flooring of scale when calculating available scales.
> This results available-scales to be reported smaller than they should
> when the division in scale computation resulted remainder greater than
> half of the divider. (decimal part of result > 0.5)
> 
> Furthermore, when gains are computed based on scale, the gain resulting
> from the scale computation is also floored. As a consequence the
> floored scales reported by available scales may not match the gains that
> can be set.
> 
> The related discussion can be found from:
> https://lore.kernel.org/all/84d7c283-e8e5-4c98-835c-fe3f6ff94f4b@gmail.com/
> 
> Do rounding when computing scales and gains.
> 
> Fixes: 38416c28e168 ("iio: light: Add gain-time-scale helpers")
> Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>

Hi Matti,

A few questions inline about the maths.

> 
> ---
> Subjahit, is there any chance you test this patch with your driver? Can
> you drop the:
> 	if (val2 % 10)
> 		val2 += 1;
> from scale setting and do you see written and read scales matching?
> 
> I did run a few Kunit tests on this change - but I'm still a bit jumpy
> on it... Reviewing/testing is highly appreciated!
> 
> Just in case someone is interested in seeing the Kunit tests, they're
> somewhat unpolished & crude and can emit noisy debug prints - but can
> anyways be found from:
> https://github.com/M-Vaittinen/linux/commits/iio-gts-helpers-test-v6.6
> 
> ---
>  drivers/iio/industrialio-gts-helper.c | 58 +++++++++++++++++++++++----
>  1 file changed, 50 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/iio/industrialio-gts-helper.c b/drivers/iio/industrialio-gts-helper.c
> index 7653261d2dc2..7dc144ac10c8 100644
> --- a/drivers/iio/industrialio-gts-helper.c
> +++ b/drivers/iio/industrialio-gts-helper.c
> @@ -18,6 +18,32 @@
>  #include <linux/iio/iio-gts-helper.h>
>  #include <linux/iio/types.h>
>  
> +static int iio_gts_get_gain_32(u64 full, unsigned int scale)
> +{
> +	unsigned int full32 = (unsigned int) full;
> +	unsigned int rem;
> +	int result;
> +
> +	if (full == (u64)full32) {
> +		unsigned int rem;
> +
> +		result = full32 / scale;
> +		rem = full32 - scale * result;
> +		if (rem >= scale / 2)
> +			result++;
> +
> +		return result;
> +	}
> +
> +	rem = do_div(full, scale);

As below, can we just add scale/2 to full in the do_div?

> +	if ((u64)rem >= scale / 2)
> +		result = full + 1;
> +	else
> +		result = full;
> +
> +	return result;
> +}
> +
>  /**
>   * iio_gts_get_gain - Convert scale to total gain
>   *
> @@ -28,30 +54,42 @@
>   *		scale is 64 100 000 000.
>   * @scale:	Linearized scale to compute the gain for.
>   *
> - * Return:	(floored) gain corresponding to the scale. -EINVAL if scale
> + * Return:	(rounded) gain corresponding to the scale. -EINVAL if scale
>   *		is invalid.
>   */
>  static int iio_gts_get_gain(const u64 max, const u64 scale)
>  {
> -	u64 full = max;
> +	u64 full = max, half_div;
> +	unsigned int scale32 = (unsigned int) scale;
>  	int tmp = 1;
>  
> -	if (scale > full || !scale)
> +	if (scale / 2 > full || !scale)

Seems odd. Why are we checking scale / 2 here?

>  		return -EINVAL;
>  
> +	/*
> +	 * The loop-based implementation below will potentially run _long_
> +	 * if we have a small scale and large 'max' - which may be needed when
> +	 * GTS is used for channels returning specific units. Luckily we can
> +	 * avoid the loop when scale is small and fits in 32 bits.
> +	 */
> +	if ((u64)scale32 == scale)
> +		return iio_gts_get_gain_32(full, scale32);
> +
>  	if (U64_MAX - full < scale) {
>  		/* Risk of overflow */
> -		if (full - scale < scale)
> +		if (full - scale / 2 < scale)
>  			return 1;
>  
>  		full -= scale;
>  		tmp++;
>  	}
>  
> -	while (full > scale * (u64)tmp)
> +	half_div = scale >> 2;

Why divide by 4?  Looks like classic issue with using shifts for division
causing confusion.
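
To make the shift point concrete, here is a minimal user-space sketch. The
helper name round_with_bias() is made up for illustration and is not part of
the patch, and the sketch assumes that n plus the bias does not overflow.
A shift by one bit gives half of the divisor, a shift by two bits only a
quarter, so with ">> 2" the result rounds up only from a fractional part of
about 0.75 instead of 0.5.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: divide n by d after adding (d >> shift) as a
 * rounding bias. shift == 1 adds d / 2 (round to nearest), shift == 2
 * adds d / 4, which is what "scale >> 2" in the patch computes.
 * Assumes n + (d >> shift) does not overflow.
 */
static uint64_t round_with_bias(uint64_t n, uint64_t d, unsigned int shift)
{
	assert(d != 0);

	return (n + (d >> shift)) / d;
}
```

With n = 26 and d = 10 (true quotient 2.6), the half-divisor bias yields 3,
while the quarter-divisor bias still yields 2.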

> +
> +	while (full + half_div >= scale * (u64)tmp)
>  		tmp++;
>  
> -	return tmp;
> +	return tmp - 1;
>  }
>  
>  /**
> @@ -133,6 +171,7 @@ static int iio_gts_linearize(int scale_whole, int scale_nano,
>   * Convert the total gain value to scale. NOTE: This does not separate gain
>   * generated by HW-gain or integration time. It is up to caller to decide what
>   * part of the total gain is due to integration time and what due to HW-gain.
> + * Computed gain is rounded to nearest integer.
>   *
>   * Return: 0 on success. Negative errno on failure.
>   */
> @@ -140,10 +179,13 @@ int iio_gts_total_gain_to_scale(struct iio_gts *gts, int total_gain,
>  				int *scale_int, int *scale_nano)
>  {
>  	u64 tmp;
> +	int rem;
>  
>  	tmp = gts->max_scale;
>  
> -	do_div(tmp, total_gain);
> +	rem = do_div(tmp, total_gain);

can we do usual trick of
do_div(tmp + total_gain/2, total_gain)
to get the same rounding effect?
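
A sketch of that trick in user-space C, for illustration only. Note that the
kernel's do_div() updates its first argument in place and returns the
remainder, so in kernel code the half would have to be added to tmp before
the call rather than inside the argument list. The function name below is an
assumption, and the sketch ignores the case where the sum overflows.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round-to-nearest division by adding half of the divisor first.
 * User-space stand-in for the kernel pattern
 *	tmp += total_gain / 2;
 *	do_div(tmp, total_gain);
 * Assumes n + d / 2 does not overflow 64 bits.
 */
static uint64_t div_round_add_half(uint64_t n, uint32_t d)
{
	assert(d != 0);

	return (n + d / 2) / d;
}
```

For example, 10 / 4 (2.5) becomes 3 and 9 / 4 (2.25) stays 2.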

> +	if (total_gain > 1 && rem >= total_gain / 2)
> +		tmp += 1ULL;
>  
>  	return iio_gts_delinearize(tmp, NANO, scale_int, scale_nano);
>  }
> @@ -192,7 +234,7 @@ static int gain_to_scaletables(struct iio_gts *gts, int **gains, int **scales)
>  		sort(gains[i], gts->num_hwgain, sizeof(int), iio_gts_gain_cmp,
>  		     NULL);
>  
> -		/* Convert gains to scales */
> +		/* Convert gains to scales. */

Grumble - unrelated change.

>  		for (j = 0; j < gts->num_hwgain; j++) {
>  			ret = iio_gts_total_gain_to_scale(gts, gains[i][j],
>  							  &scales[i][2 * j],
> 
> base-commit: ffc253263a1375a65fa6c9f62a893e9767fbebfa
Matti Vaittinen Nov. 27, 2023, 7:48 a.m. UTC | #2
On 11/26/23 19:26, Jonathan Cameron wrote:
> On Tue, 31 Oct 2023 11:50:46 +0200
> Matti Vaittinen <mazziesaccount@gmail.com> wrote:
> 
>> The GTS helpers do flooring of scale when calculating available scales.
>> This results available-scales to be reported smaller than they should
>> when the division in scale computation resulted remainder greater than
>> half of the divider. (decimal part of result > 0.5)
>>
>> Furthermore, when gains are computed based on scale, the gain resulting
>> from the scale computation is also floored. As a consequence the
>> floored scales reported by available scales may not match the gains that
>> can be set.
>>
>> The related discussion can be found from:
>> https://lore.kernel.org/all/84d7c283-e8e5-4c98-835c-fe3f6ff94f4b@gmail.com/
>>
>> Do rounding when computing scales and gains.
>>
>> Fixes: 38416c28e168 ("iio: light: Add gain-time-scale helpers")
>> Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>
> 
> Hi Matti,
> 
> A few questions inline about the maths.

I appreciate the questions :) Thanks!
> 
>>
>> ---
>> Subjahit, is there any chance you test this patch with your driver? Can
>> you drop the:
>> 	if (val2 % 10)
>> 		val2 += 1;
>> from scale setting and do you see written and read scales matching?
>>
>> I did run a few Kunit tests on this change - but I'm still a bit jumpy
>> on it... Reviewing/testing is highly appreciated!
>>
>> Just in case someone is interested in seeing the Kunit tests, they're
>> somewhat unpolished & crude and can emit noisy debug prints - but can
>> anyways be found from:
>> https://github.com/M-Vaittinen/linux/commits/iio-gts-helpers-test-v6.6
>>
>> ---
>>   drivers/iio/industrialio-gts-helper.c | 58 +++++++++++++++++++++++----
>>   1 file changed, 50 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/iio/industrialio-gts-helper.c b/drivers/iio/industrialio-gts-helper.c
>> index 7653261d2dc2..7dc144ac10c8 100644
>> --- a/drivers/iio/industrialio-gts-helper.c
>> +++ b/drivers/iio/industrialio-gts-helper.c
>> @@ -18,6 +18,32 @@
>>   #include <linux/iio/iio-gts-helper.h>
>>   #include <linux/iio/types.h>
>>   
>> +static int iio_gts_get_gain_32(u64 full, unsigned int scale)
>> +{
>> +	unsigned int full32 = (unsigned int) full;
>> +	unsigned int rem;
>> +	int result;
>> +
>> +	if (full == (u64)full32) {
>> +		unsigned int rem;
>> +
>> +		result = full32 / scale;
>> +		rem = full32 - scale * result;
>> +		if (rem >= scale / 2)
>> +			result++;
>> +
>> +		return result;
>> +	}
>> +
>> +	rem = do_div(full, scale);
> 
> As below, can we just add scale/2 to full in the do_div?

The rationale for doing it this way is to prevent a (theoretical?)
overflow when adding scale/2 to full. Maybe this warrants adding a comment?
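
As a rough user-space sketch of the remainder-based approach (the name
div_round_by_rem() is hypothetical): nothing is added to the dividend, so
there is nothing to overflow. Comparing the remainder against (d - rem)
rather than d / 2 is a small variation on the patch. With "rem >= scale / 2"
an odd divisor appears to round up slightly early, e.g. 7 / 5 = 1.4 would
become 2.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round-to-nearest division that never adds to the dividend, so it
 * cannot overflow. Comparing rem against (d - rem) instead of d / 2
 * rounds up only when the fractional part is at least one half, also
 * for odd divisors.
 */
static uint64_t div_round_by_rem(uint64_t n, uint64_t d)
{
	uint64_t q, rem;

	assert(d != 0);
	q = n / d;
	rem = n % d;
	if (rem >= d - rem)
		q++;

	return q;
}
```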

> 
>> +	if ((u64)rem >= scale / 2)
>> +		result = full + 1;
>> +	else
>> +		result = full;
>> +
>> +	return result;
>> +}
>> +
>>   /**
>>    * iio_gts_get_gain - Convert scale to total gain
>>    *
>> @@ -28,30 +54,42 @@
>>    *		scale is 64 100 000 000.
>>    * @scale:	Linearized scale to compute the gain for.
>>    *
>> - * Return:	(floored) gain corresponding to the scale. -EINVAL if scale
>> + * Return:	(rounded) gain corresponding to the scale. -EINVAL if scale
>>    *		is invalid.
>>    */
>>   static int iio_gts_get_gain(const u64 max, const u64 scale)
>>   {
>> -	u64 full = max;
>> +	u64 full = max, half_div;
>> +	unsigned int scale32 = (unsigned int) scale;
>>   	int tmp = 1;
>>   
>> -	if (scale > full || !scale)
>> +	if (scale / 2 > full || !scale)
> 
> Seems odd. Why are we checking scale / 2 here?

I am pretty sure I have been thinking of rounding 0.5 to 1.

> 
>>   		return -EINVAL;
>>   
>> +	/*
>> +	 * The loop-based implementation below will potentially run _long_
>> +	 * if we have a small scale and large 'max' - which may be needed when
>> +	 * GTS is used for channels returning specific units. Luckily we can
>> +	 * avoid the loop when scale is small and fits in 32 bits.
>> +	 */
>> +	if ((u64)scale32 == scale)
>> +		return iio_gts_get_gain_32(full, scale32);
>> +
>>   	if (U64_MAX - full < scale) {
>>   		/* Risk of overflow */
>> -		if (full - scale < scale)
>> +		if (full - scale / 2 < scale)
>>   			return 1;
>>   
>>   		full -= scale;
>>   		tmp++;
>>   	}
>>   
>> -	while (full > scale * (u64)tmp)
>> +	half_div = scale >> 2;
> 
> Why divide by 4?  Looks like classic issue with using shifts for division
> causing confusion.

Yes. Looks like a brainfart to me. I need to fire up my tests and revise
this (and the check you asked about above). It seems to take a while
for me to wrap my head around this again...

Thanks for pointing this out!

> 
>> +
>> +	while (full + half_div >= scale * (u64)tmp)
>>   		tmp++;
>>   
>> -	return tmp;
>> +	return tmp - 1;
>>   }
>>   
>>   /**
>> @@ -133,6 +171,7 @@ static int iio_gts_linearize(int scale_whole, int scale_nano,
>>    * Convert the total gain value to scale. NOTE: This does not separate gain
>>    * generated by HW-gain or integration time. It is up to caller to decide what
>>    * part of the total gain is due to integration time and what due to HW-gain.
>> + * Computed gain is rounded to nearest integer.
>>    *
>>    * Return: 0 on success. Negative errno on failure.
>>    */
>> @@ -140,10 +179,13 @@ int iio_gts_total_gain_to_scale(struct iio_gts *gts, int total_gain,
>>   				int *scale_int, int *scale_nano)
>>   {
>>   	u64 tmp;
>> +	int rem;
>>   
>>   	tmp = gts->max_scale;
>>   
>> -	do_div(tmp, total_gain);
>> +	rem = do_div(tmp, total_gain);
> 
> can we do usual trick of
> do_div(tmp + total_gain/2, total_gain)
> to get the same rounding effect?

Only if we don't care about the case where tmp + total_gain/2 overflows.

> 
>> +	if (total_gain > 1 && rem >= total_gain / 2)
>> +		tmp += 1ULL;
>>   
>>   	return iio_gts_delinearize(tmp, NANO, scale_int, scale_nano);
>>   }
>> @@ -192,7 +234,7 @@ static int gain_to_scaletables(struct iio_gts *gts, int **gains, int **scales)
>>   		sort(gains[i], gts->num_hwgain, sizeof(int), iio_gts_gain_cmp,
>>   		     NULL);
>>   
>> -		/* Convert gains to scales */
>> +		/* Convert gains to scales. */
> 
> Grumble - unrelated change.

Yes. I'll drop this.

> 
>>   		for (j = 0; j < gts->num_hwgain; j++) {
>>   			ret = iio_gts_total_gain_to_scale(gts, gains[i][j],
>>   							  &scales[i][2 * j],
>>
>> base-commit: ffc253263a1375a65fa6c9f62a893e9767fbebfa

All in all, I am still not 100% sure if rounding is the right approach.
Do we cause hidden accuracy issues by doing the rounding under the hood?
I feel I need bigger brains :)
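
One way to probe that question is a gain -> scale -> gain round trip. The
sketch below is user-space C with hypothetical helper names, and
64000000000 is only an assumed linearized maximum scale. It compares
rounding in both directions against rounding the scale but flooring the
gain.

```c
#include <assert.h>
#include <stdint.h>

/* Round-to-nearest division without adding to the dividend. */
static uint64_t div_round(uint64_t n, uint64_t d)
{
	uint64_t q, rem;

	assert(d != 0);
	q = n / d;
	rem = n % d;

	return (rem >= d - rem) ? q + 1 : q;
}

/* gain -> scale -> gain, rounding to nearest in both directions. */
static uint64_t round_trip_rounded(uint64_t max, uint64_t gain)
{
	uint64_t scale = div_round(max, gain);

	return div_round(max, scale);
}

/* gain -> rounded scale -> floored gain. */
static uint64_t round_trip_mixed(uint64_t max, uint64_t gain)
{
	uint64_t scale = div_round(max, gain);

	assert(scale != 0);

	return max / scale;
}
```

With max = 64000000000 and gain = 7, the rounded scale is 9142857143.
Flooring the way back gives 6, while rounding gives 7 again. This says
nothing about other inputs, so it is a spot check rather than a proof.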

Yours,
	-- Matti
Matti Vaittinen Nov. 28, 2023, 11:56 a.m. UTC | #3
On 11/27/23 09:48, Matti Vaittinen wrote:
> On 11/26/23 19:26, Jonathan Cameron wrote:
>> On Tue, 31 Oct 2023 11:50:46 +0200
>> Matti Vaittinen <mazziesaccount@gmail.com> wrote:
>>
>>> The GTS helpers do flooring of scale when calculating available scales.
>>> This results available-scales to be reported smaller than they should
>>> when the division in scale computation resulted remainder greater than
>>> half of the divider. (decimal part of result > 0.5)
>>>
>>> Furthermore, when gains are computed based on scale, the gain resulting
>>> from the scale computation is also floored. As a consequence the
>>> floored scales reported by available scales may not match the gains that
>>> can be set.
>>>
>>> The related discussion can be found from:
>>> https://lore.kernel.org/all/84d7c283-e8e5-4c98-835c-fe3f6ff94f4b@gmail.com/
>>>
>>> Do rounding when computing scales and gains.
>>>
>>> Fixes: 38416c28e168 ("iio: light: Add gain-time-scale helpers")
>>> Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>
>>
>> Hi Matti,
>>
>> A few questions inline about the maths.
> 
> I appreciate the questions :) Thanks!
>>
>>>
>>> ---
>>> Subjahit, is there any chance you test this patch with your driver? Can
>>> you drop the:
>>>     if (val2 % 10)
>>>         val2 += 1;
>>> from scale setting and do you see written and read scales matching?
>>>
>>> I did run a few Kunit tests on this change - but I'm still a bit jumpy
>>> on it... Reviewing/testing is highly appreciated!
>>>
>>> Just in case someone is interested in seeing the Kunit tests, they're
>>> somewhat unpolished & crude and can emit noisy debug prints - but can
>>> anyways be found from:
>>> https://github.com/M-Vaittinen/linux/commits/iio-gts-helpers-test-v6.6
>>>
>>> ---
>>>   drivers/iio/industrialio-gts-helper.c | 58 +++++++++++++++++++++++----
>>>   1 file changed, 50 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/iio/industrialio-gts-helper.c b/drivers/iio/industrialio-gts-helper.c
>>> index 7653261d2dc2..7dc144ac10c8 100644
>>> --- a/drivers/iio/industrialio-gts-helper.c
>>> +++ b/drivers/iio/industrialio-gts-helper.c
>>> @@ -18,6 +18,32 @@
>>>   #include <linux/iio/iio-gts-helper.h>
>>>   #include <linux/iio/types.h>
>>> +static int iio_gts_get_gain_32(u64 full, unsigned int scale)
>>> +{
>>> +    unsigned int full32 = (unsigned int) full;
>>> +    unsigned int rem;
>>> +    int result;
>>> +
>>> +    if (full == (u64)full32) {
>>> +        unsigned int rem;
>>> +
>>> +        result = full32 / scale;
>>> +        rem = full32 - scale * result;
>>> +        if (rem >= scale / 2)
>>> +            result++;
>>> +
>>> +        return result;
>>> +    }
>>> +
>>> +    rem = do_div(full, scale);
>>
>> As below, can we just add scale/2 to full in the do_div?
> 
> The rationale for doing is it in this way is to prevent (theoretical?) 
> overflow when adding scale/2 to full. Maybe this warrants adding a comment?
> 
>>
>>> +    if ((u64)rem >= scale / 2)
>>> +        result = full + 1;
>>> +    else
>>> +        result = full;
>>> +
>>> +    return result;
>>> +}
>>> +
>>>   /**
>>>    * iio_gts_get_gain - Convert scale to total gain
>>>    *
>>> @@ -28,30 +54,42 @@
>>>    *        scale is 64 100 000 000.
>>>    * @scale:    Linearized scale to compute the gain for.
>>>    *
>>> - * Return:    (floored) gain corresponding to the scale. -EINVAL if scale
>>> + * Return:    (rounded) gain corresponding to the scale. -EINVAL if scale
>>>    *        is invalid.
>>>    */
>>>   static int iio_gts_get_gain(const u64 max, const u64 scale)
>>>   {
>>> -    u64 full = max;
>>> +    u64 full = max, half_div;
>>> +    unsigned int scale32 = (unsigned int) scale;
>>>       int tmp = 1;
>>> -    if (scale > full || !scale)
>>> +    if (scale / 2 > full || !scale)
>>
>> Seems odd. Why are we checking scale / 2 here?
> 
> I am pretty sure I have been thinking of rounding 0.5 to 1.
> 
>>
>>>           return -EINVAL;
>>> +    /*
>>> +     * The loop-based implementation below will potentially run _long_
>>> +     * if we have a small scale and large 'max' - which may be needed when
>>> +     * GTS is used for channels returning specific units. Luckily we can
>>> +     * avoid the loop when scale is small and fits in 32 bits.
>>> +     */
>>> +    if ((u64)scale32 == scale)
>>> +        return iio_gts_get_gain_32(full, scale32);
>>> +
>>>       if (U64_MAX - full < scale) {
>>>           /* Risk of overflow */
>>> -        if (full - scale < scale)
>>> +        if (full - scale / 2 < scale)
>>>               return 1;
>>>           full -= scale;
>>>           tmp++;
>>>       }
>>> -    while (full > scale * (u64)tmp)
>>> +    half_div = scale >> 2;
>>
>> Why divide by 4?  Looks like classic issue with using shifts for division
>> causing confusion.
> 
> Yes. Looks like a brainfart to me. I need to fire-up my tests and revise 
> this (and the check you asked about above). It seems to take a while 
> from me to wrap my head around this again...
> 
> Thanks for pointing this out!
> 
>>
>>> +
>>> +    while (full + half_div >= scale * (u64)tmp)
>>>           tmp++;

Oh. This is a problem. Adding half_div to full here can cause scale *
(u64)tmp to overflow. The overflow prevention above only ensures that
full is smaller than U64_MAX - scale. Here we should ensure that full +
half_div is less than U64_MAX - scale so that the loop always stops.

All in all, this is horrible. Just ran a quick and dirty test on my 
laptop, and using 0xFFFF FFFF FFFF FFFF as full and 0x1 0000 0000 as 
scale (without the half_div addition) ran this loop for several seconds.

Sigh. My brains jammed. I know this cannot be a unique problem. I am
sure there exists a better solution somewhere - any pointers would be
appreciated :)
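
To illustrate why the loop is slow, here is a user-space sketch with
hypothetical function names. The search loop has the same shape as the
pre-patch code and needs roughly one iteration per unit of the quotient, so
a quotient close to 2^32, as in the test above, would mean on the order of
four billion iterations. A single division gives the same value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Search loop of the same shape as the pre-patch code: returns the
 * smallest tmp for which scale * tmp >= full and counts the iterations.
 * The assert keeps scale * tmp from wrapping around.
 */
static uint64_t gain_by_loop(uint64_t full, uint64_t scale, uint64_t *steps)
{
	uint64_t tmp = 1;

	assert(scale != 0 && full <= UINT64_MAX - scale);
	*steps = 0;
	while (full > scale * tmp) {
		tmp++;
		(*steps)++;
	}

	return tmp;
}

/* The same result from one division: ceil(full / scale), at least 1. */
static uint64_t gain_by_div(uint64_t full, uint64_t scale)
{
	uint64_t q;

	assert(scale != 0);
	q = full / scale + (full % scale != 0);

	return q ? q : 1;
}
```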

>>> -    return tmp;
>>> +    return tmp - 1;
>>>   }
>>>   /**

Yours,
	-- Matti
Matti Vaittinen Nov. 28, 2023, 1:16 p.m. UTC | #4
On 11/28/23 13:56, Matti Vaittinen wrote:
> On 11/27/23 09:48, Matti Vaittinen wrote:
>> On 11/26/23 19:26, Jonathan Cameron wrote:
>>> On Tue, 31 Oct 2023 11:50:46 +0200
>>> Matti Vaittinen <mazziesaccount@gmail.com> wrote:
>>>
>>>> The GTS helpers do flooring of scale when calculating available scales.
>>>> This results available-scales to be reported smaller than they should
>>>> when the division in scale computation resulted remainder greater than
>>>> half of the divider. (decimal part of result > 0.5)
>>>>
>>>> Furthermore, when gains are computed based on scale, the gain resulting
>>>> from the scale computation is also floored. As a consequence the
>>>> floored scales reported by available scales may not match the gains that
>>>> can be set.
>>>>
>>>> The related discussion can be found from:
>>>> https://lore.kernel.org/all/84d7c283-e8e5-4c98-835c-fe3f6ff94f4b@gmail.com/
>>>>
>>>> Do rounding when computing scales and gains.
>>>>
>>>> Fixes: 38416c28e168 ("iio: light: Add gain-time-scale helpers")
>>>> Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>
>>>

...

>>>> +    if ((u64)scale32 == scale)
>>>> +        return iio_gts_get_gain_32(full, scale32);
>>>> +
>>>>       if (U64_MAX - full < scale) {
>>>>           /* Risk of overflow */
>>>> -        if (full - scale < scale)
>>>> +        if (full - scale / 2 < scale)
>>>>               return 1;
>>>>           full -= scale;
>>>>           tmp++;
>>>>       }
>>>> -    while (full > scale * (u64)tmp)
>>>> +    half_div = scale >> 2;
>>>
>>> Why divide by 4?  Looks like classic issue with using shifts for 
>>> division
>>> causing confusion.
>>
>> Yes. Looks like a brainfart to me. I need to fire-up my tests and 
>> revise this (and the check you asked about above). It seems to take a 
>> while from me to wrap my head around this again...
>>
>> Thanks for pointing this out!
>>
>>>
>>>> +
>>>> +    while (full + half_div >= scale * (u64)tmp)
>>>>           tmp++;
> 
> Oh. This is a problem. Adding half_div to full here can cause the scale 
> * (u64)tmp to overflow. The overflow-prevention above only ensures full 
> is smaller than the U64_MAX - scale. Here we should ensure full + 
> half_div is less than U64_MAX - scale to ensure the loop always stops.
> 
> All in all, this is horrible. Just ran a quick and dirty test on my 
> laptop, and using 0xFFFF FFFF FFFF FFFF as full and 0x1 0000 0000 as 
> scale (without the half_div addition) ran this loop for several seconds.
> 
> Sigh. My brains jammed. I know this can not be an unique problem. I am 
> sure there exists a better solution somewhere - any pointers would be 
> appreciated :)
> 

And as a reply to myself - is there something wrong with using
div64_u64()? Sorry for the noise...
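
A rough sketch of what a loop-free version could look like, written as
user-space C. The div64_u64() below is only a stand-in for the kernel
helper from linux/math64.h, and gain_from_scale() is a hypothetical name,
not the v2 implementation. I believe math64.h also has
DIV64_U64_ROUND_CLOSEST(), which adds half of the divisor before dividing.
The sketch rounds via the remainder instead so that nothing can overflow.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for the kernel's div64_u64(). */
static uint64_t div64_u64(uint64_t dividend, uint64_t divisor)
{
	assert(divisor != 0);

	return dividend / divisor;
}

/*
 * Hypothetical loop-free gain computation: one 64-bit division plus a
 * remainder check for round-to-nearest. Returns 0 on success and -1
 * for a zero scale (the kernel code returns -EINVAL there).
 */
static int gain_from_scale(uint64_t max, uint64_t scale, uint64_t *gain)
{
	uint64_t q, rem;

	if (!scale)
		return -1;

	q = div64_u64(max, scale);
	rem = max - q * scale;	/* q * scale <= max, so no overflow */
	if (rem >= scale - rem)
		q++;

	*gain = q;

	return 0;
}
```

With full = 0xFFFFFFFFFFFFFFFF and scale = 0x100000000 this returns
0x100000000 without any looping.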
Jonathan Cameron Dec. 4, 2023, 2:30 p.m. UTC | #5
On Mon, 27 Nov 2023 09:48:08 +0200
Matti Vaittinen <mazziesaccount@gmail.com> wrote:

> On 11/26/23 19:26, Jonathan Cameron wrote:
> > On Tue, 31 Oct 2023 11:50:46 +0200
> > Matti Vaittinen <mazziesaccount@gmail.com> wrote:
> >   
> >> The GTS helpers do flooring of scale when calculating available scales.
> >> This results available-scales to be reported smaller than they should
> >> when the division in scale computation resulted remainder greater than
> >> half of the divider. (decimal part of result > 0.5)
> >>
> >> Furthermore, when gains are computed based on scale, the gain resulting
> >> from the scale computation is also floored. As a consequence the
> >> floored scales reported by available scales may not match the gains that
> >> can be set.
> >>
> >> The related discussion can be found from:
> >> https://lore.kernel.org/all/84d7c283-e8e5-4c98-835c-fe3f6ff94f4b@gmail.com/
> >>
> >> Do rounding when computing scales and gains.
> >>
> >> Fixes: 38416c28e168 ("iio: light: Add gain-time-scale helpers")
> >> Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>  
> > 
> > Hi Matti,
> > 
> > A few questions inline about the maths.  
> 
> I appreciate the questions :) Thanks!

I found some emails hiding so late replies...
> >   
> >>
> >> ---
> >> Subjahit, is there any chance you test this patch with your driver? Can
> >> you drop the:
> >> 	if (val2 % 10)
> >> 		val2 += 1;
> >> from scale setting and do you see written and read scales matching?
> >>
> >> I did run a few Kunit tests on this change - but I'm still a bit jumpy
> >> on it... Reviewing/testing is highly appreciated!
> >>
> >> Just in case someone is interested in seeing the Kunit tests, they're
> >> somewhat unpolished & crude and can emit noisy debug prints - but can
> >> anyways be found from:
> >> https://github.com/M-Vaittinen/linux/commits/iio-gts-helpers-test-v6.6
> >>
> >> ---
> >>   drivers/iio/industrialio-gts-helper.c | 58 +++++++++++++++++++++++----
> >>   1 file changed, 50 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/drivers/iio/industrialio-gts-helper.c b/drivers/iio/industrialio-gts-helper.c
> >> index 7653261d2dc2..7dc144ac10c8 100644
> >> --- a/drivers/iio/industrialio-gts-helper.c
> >> +++ b/drivers/iio/industrialio-gts-helper.c
> >> @@ -18,6 +18,32 @@
> >>   #include <linux/iio/iio-gts-helper.h>
> >>   #include <linux/iio/types.h>
> >>   
> >> +static int iio_gts_get_gain_32(u64 full, unsigned int scale)
> >> +{
> >> +	unsigned int full32 = (unsigned int) full;
> >> +	unsigned int rem;
> >> +	int result;
> >> +
> >> +	if (full == (u64)full32) {
> >> +		unsigned int rem;
> >> +
> >> +		result = full32 / scale;
> >> +		rem = full32 - scale * result;
> >> +		if (rem >= scale / 2)
> >> +			result++;
> >> +
> >> +		return result;
> >> +	}
> >> +
> >> +	rem = do_div(full, scale);  
> > 
> > As below, can we just add scale/2 to full in the do_div?  
> 
> The rationale for doing is it in this way is to prevent (theoretical?) 
> overflow when adding scale/2 to full. Maybe this warrants adding a comment?

Hmm. Chances are very low of hitting that.  I'd just go with adding scale/2
before the div.  If you really want to worry about being right at the edge
of available precision, then add a check for that.
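
Something along these lines, as a user-space sketch. The function name is
made up. In the kernel, check_add_overflow() from linux/overflow.h could
probably do the same job as the manual comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Add-half rounding with an explicit overflow check. Returns false for
 * a zero divisor or if full + scale / 2 would wrap around, true
 * otherwise with the rounded quotient stored in *res.
 */
static bool div_round_checked(uint64_t full, uint32_t scale, uint64_t *res)
{
	uint64_t half = scale / 2;

	assert(res);
	if (!scale || full > UINT64_MAX - half)
		return false;

	*res = (full + half) / scale;

	return true;
}
```

The only inputs rejected by the check are those within scale / 2 of
U64_MAX.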


> 
> >   
> >> +	if ((u64)rem >= scale / 2)
> >> +		result = full + 1;
> >> +	else
> >> +		result = full;
> >> +
> >> +	return result;
> >> +}
> >> +
> >>   /**
> >>    * iio_gts_get_gain - Convert scale to total gain
> >>    *
> >> @@ -28,30 +54,42 @@
> >>    *		scale is 64 100 000 000.
> >>    * @scale:	Linearized scale to compute the gain for.
> >>    *
> >> - * Return:	(floored) gain corresponding to the scale. -EINVAL if scale
> >> + * Return:	(rounded) gain corresponding to the scale. -EINVAL if scale
> >>    *		is invalid.
> >>    */
> >>   static int iio_gts_get_gain(const u64 max, const u64 scale)
> >>   {
> >> -	u64 full = max;
> >> +	u64 full = max, half_div;
> >> +	unsigned int scale32 = (unsigned int) scale;
> >>   	int tmp = 1;
> >>   
> >> -	if (scale > full || !scale)
> >> +	if (scale / 2 > full || !scale)  
> > 
> > Seems odd. Why are we checking scale / 2 here?  
> 
> I am pretty sure I have been thinking of rounding 0.5 to 1.

Not sure I follow - but maybe it'll be clear in v2.

> >   
> >> +
> >> +	while (full + half_div >= scale * (u64)tmp)
> >>   		tmp++;
> >>   
> >> -	return tmp;
> >> +	return tmp - 1;
> >>   }
> >>   
> >>   /**
> >> @@ -133,6 +171,7 @@ static int iio_gts_linearize(int scale_whole, int scale_nano,
> >>    * Convert the total gain value to scale. NOTE: This does not separate gain
> >>    * generated by HW-gain or integration time. It is up to caller to decide what
> >>    * part of the total gain is due to integration time and what due to HW-gain.
> >> + * Computed gain is rounded to nearest integer.
> >>    *
> >>    * Return: 0 on success. Negative errno on failure.
> >>    */
> >> @@ -140,10 +179,13 @@ int iio_gts_total_gain_to_scale(struct iio_gts *gts, int total_gain,
> >>   				int *scale_int, int *scale_nano)
> >>   {
> >>   	u64 tmp;
> >> +	int rem;
> >>   
> >>   	tmp = gts->max_scale;
> >>   
> >> -	do_div(tmp, total_gain);
> >> +	rem = do_div(tmp, total_gain);  
> > 
> > can we do usual trick of
> > do_div(tmp + total_gain/2, total_gain)
> > to get the same rounding effect?  
> 
> Only if we don't care about the case where tmp + total_gain/2 overflows.

As above. The cases where that happens are pretty narrow.  I'd not worry about it
or I'd check for that overflow.

> 
> >   
> >> +	if (total_gain > 1 && rem >= total_gain / 2)
> >> +		tmp += 1ULL;
> >>   
> >>   	return iio_gts_delinearize(tmp, NANO, scale_int, scale_nano);
> >>   }
> >> @@ -192,7 +234,7 @@ static int gain_to_scaletables(struct iio_gts *gts, int **gains, int **scales)
> >>   		sort(gains[i], gts->num_hwgain, sizeof(int), iio_gts_gain_cmp,
> >>   		     NULL);
> >>   
> >> -		/* Convert gains to scales */
> >> +		/* Convert gains to scales. */  
> > 
> > Grumble - unrelated change.  
> 
> Yes. I'll drop this.
> 
> >   
> >>   		for (j = 0; j < gts->num_hwgain; j++) {
> >>   			ret = iio_gts_total_gain_to_scale(gts, gains[i][j],
> >>   							  &scales[i][2 * j],
> >>
> >> base-commit: ffc253263a1375a65fa6c9f62a893e9767fbebfa  
> 
> All in all, I am still not 100% sure if rounding is the right ambition. 
> Do we cause hidden accuracy issues by doing the rounding under the hood? 
> I feel I need bigger brains :)
Don't we all!

Jonathan

> 
> Yours,
> 	-- Matti
> 
>
Matti Vaittinen Dec. 5, 2023, 7:10 a.m. UTC | #6
On 12/4/23 16:30, Jonathan Cameron wrote:
> On Mon, 27 Nov 2023 09:48:08 +0200
> Matti Vaittinen <mazziesaccount@gmail.com> wrote:
> 
>> On 11/26/23 19:26, Jonathan Cameron wrote:
>>> On Tue, 31 Oct 2023 11:50:46 +0200
>>> Matti Vaittinen <mazziesaccount@gmail.com> wrote:
>>>    
>>>> The GTS helpers do flooring of scale when calculating available scales.
>>>> This results available-scales to be reported smaller than they should
>>>> when the division in scale computation resulted remainder greater than
>>>> half of the divider. (decimal part of result > 0.5)
>>>>
>>>> Furthermore, when gains are computed based on scale, the gain resulting
>>>> from the scale computation is also floored. As a consequence the
>>>> floored scales reported by available scales may not match the gains that
>>>> can be set.
>>>>
>>>> The related discussion can be found from:
>>>> https://lore.kernel.org/all/84d7c283-e8e5-4c98-835c-fe3f6ff94f4b@gmail.com/
>>>>
>>>> Do rounding when computing scales and gains.
>>>>
>>>> Fixes: 38416c28e168 ("iio: light: Add gain-time-scale helpers")
>>>> Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>
>>>
>>> Hi Matti,
>>>
>>> A few questions inline about the maths.
>>
>> I appreciate the questions :) Thanks!
> 
> I found some emails hiding so late replies...

Better late than never :)

To tell the truth, delays have been Ok. I think Subhajit has not needed 
this urgently and the darkness of the winter in Finland has hindered my 
energy and activity to very low levels.

>>>> ---
>>>> Subjahit, is there any chance you test this patch with your driver? Can
>>>> you drop the:
>>>> 	if (val2 % 10)
>>>> 		val2 += 1;
>>>> from scale setting and do you see written and read scales matching?
>>>>
>>>> I did run a few Kunit tests on this change - but I'm still a bit jumpy
>>>> on it... Reviewing/testing is highly appreciated!
>>>>
>>>> Just in case someone is interested in seeing the Kunit tests, they're
>>>> somewhat unpolished & crude and can emit noisy debug prints - but can
>>>> anyways be found from:
>>>> https://github.com/M-Vaittinen/linux/commits/iio-gts-helpers-test-v6.6
>>>>
>>>> ---
>>>>    drivers/iio/industrialio-gts-helper.c | 58 +++++++++++++++++++++++----
>>>>    1 file changed, 50 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/iio/industrialio-gts-helper.c b/drivers/iio/industrialio-gts-helper.c
>>>> index 7653261d2dc2..7dc144ac10c8 100644
>>>> --- a/drivers/iio/industrialio-gts-helper.c
>>>> +++ b/drivers/iio/industrialio-gts-helper.c
>>>> @@ -18,6 +18,32 @@
>>>>    #include <linux/iio/iio-gts-helper.h>
>>>>    #include <linux/iio/types.h>
>>>>    
>>>> +static int iio_gts_get_gain_32(u64 full, unsigned int scale)
>>>> +{
>>>> +	unsigned int full32 = (unsigned int) full;
>>>> +	unsigned int rem;
>>>> +	int result;
>>>> +
>>>> +	if (full == (u64)full32) {
>>>> +		unsigned int rem;
>>>> +
>>>> +		result = full32 / scale;
>>>> +		rem = full32 - scale * result;
>>>> +		if (rem >= scale / 2)
>>>> +			result++;
>>>> +
>>>> +		return result;
>>>> +	}
>>>> +
>>>> +	rem = do_div(full, scale);
>>>
>>> As below, can we just add scale/2 to full in the do_div?
>>
>> The rationale for doing is it in this way is to prevent (theoretical?)
>> overflow when adding scale/2 to full. Maybe this warrants adding a comment?
> 
> Hmm. Chances are very low of hitting that.  I'd just go with adding scale/2
> before the div.  If you really want to worry about being right at the edge
> of available precision, then add a check for that.

I think v2 will ditch this function.
>>>> +	if ((u64)rem >= scale / 2)
>>>> +		result = full + 1;
>>>> +	else
>>>> +		result = full;
>>>> +
>>>> +	return result;
>>>> +}
>>>> +
>>>>    /**
>>>>     * iio_gts_get_gain - Convert scale to total gain
>>>>     *
>>>> @@ -28,30 +54,42 @@
>>>>     *		scale is 64 100 000 000.
>>>>     * @scale:	Linearized scale to compute the gain for.
>>>>     *
>>>> - * Return:	(floored) gain corresponding to the scale. -EINVAL if scale
>>>> + * Return:	(rounded) gain corresponding to the scale. -EINVAL if scale
>>>>     *		is invalid.
>>>>     */
>>>>    static int iio_gts_get_gain(const u64 max, const u64 scale)
>>>>    {
>>>> -	u64 full = max;
>>>> +	u64 full = max, half_div;
>>>> +	unsigned int scale32 = (unsigned int) scale;
>>>>    	int tmp = 1;
>>>>    
>>>> -	if (scale > full || !scale)
>>>> +	if (scale / 2 > full || !scale)
>>>
>>> Seems odd. Why are we checking scale / 2 here?
>>
>> I am pretty sure I have been thinking of rounding 0.5 to 1.
> 
> Not sure I follow - but maybe it'll be clear in v2.

Basically, when scale is greater than max, the division yields values
smaller than 1. So, when we do rounding, everything equal to or greater
than 0.5 and smaller than 1 should be rounded upwards. E.g., purely from
a computational perspective, when "full" is half of the scale, the
division returns 0.5, which rounds to 1. Hence the check.

But I think your question is very much a valid one. By design the driver
gives the max value - and I think that a scale exceeding this maximum can
indeed be considered invalid. Not that I feel 100% certain about that
today :)
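
For illustration only, the rounding argument above can be sketched in
plain userspace C. The helper name sketch_get_gain() is hypothetical and
this is not the v2 code. It assumes that a quotient in [0.5, 1) is
accepted and rounded up to 1, that anything rounding to 0 is rejected
with -EINVAL, and it rounds from the remainder instead of using the
"scale / 2 > full" test from the patch.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel function: returns max / scale
 * rounded to the nearest integer, or -EINVAL when scale is 0 or the
 * result does not fit a positive int.
 */
static int sketch_get_gain(uint64_t max, uint64_t scale)
{
	uint64_t q, rem;

	if (!scale)
		return -EINVAL;

	q = max / scale;
	rem = max % scale;

	/* Round half up. "rem >= scale - rem" means 2 * rem >= scale. */
	if (rem >= scale - rem)
		q++;

	/* A quotient below 0.5 rounds to 0, which is not a usable gain. */
	if (!q || q > INT_MAX)
		return -EINVAL;

	return (int)q;
}
```

With max = 50 and scale = 100 the quotient is exactly 0.5 and the sketch
returns 1. With max = 49 it returns -EINVAL.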

> 
>>>    
>>>> +
>>>> +	while (full + half_div >= scale * (u64)tmp)
>>>>    		tmp++;
>>>>    
>>>> -	return tmp;
>>>> +	return tmp - 1;
>>>>    }
>>>>    
>>>>    /**
>>>> @@ -133,6 +171,7 @@ static int iio_gts_linearize(int scale_whole, int scale_nano,
>>>>     * Convert the total gain value to scale. NOTE: This does not separate gain
>>>>     * generated by HW-gain or integration time. It is up to caller to decide what
>>>>     * part of the total gain is due to integration time and what due to HW-gain.
>>>> + * Computed gain is rounded to nearest integer.
>>>>     *
>>>>     * Return: 0 on success. Negative errno on failure.
>>>>     */
>>>> @@ -140,10 +179,13 @@ int iio_gts_total_gain_to_scale(struct iio_gts *gts, int total_gain,
>>>>    				int *scale_int, int *scale_nano)
>>>>    {
>>>>    	u64 tmp;
>>>> +	int rem;
>>>>    
>>>>    	tmp = gts->max_scale;
>>>>    
>>>> -	do_div(tmp, total_gain);
>>>> +	rem = do_div(tmp, total_gain);
>>>
>>> can we do usual trick of
>>> do_div(tmp + total_gain/2, total_gain)
>>> to get the same rounding effect?
>>
>> Only if we don't care about the case where tmp + total_gain/2 overflows.
> 
> As above. The cases where that happens are pretty narrow.  I'd not worry about it
> or I'd check for that overflow.

Part of me says you're right, while another part screams that
1) a _division_ causing overflow is against all that is well and good.
2) if we can cope with the overflow, then we should cope with it.

I am very much undecided about the best approach here. I'll see how
much clarity there is in the v2 code and what comments can do, and then
I'll throw it over for you to judge :)
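
To make the overflow concern concrete, here is a minimal userspace
sketch comparing the two rounding styles discussed above. Both function
names are made up for this example, and plain 64-bit C division stands
in for do_div(). It only illustrates the trade-off and is not a proposal
for the final code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rounds n / d by adding d / 2 before dividing. The sum wraps around
 * when n is within d / 2 of UINT64_MAX.
 */
static uint64_t round_div_add_half(uint64_t n, uint64_t d)
{
	assert(d != 0);
	return (n + d / 2) / d;
}

/*
 * Rounds n / d by inspecting the remainder, so no intermediate sum
 * can overflow.
 */
static uint64_t round_div_by_rem(uint64_t n, uint64_t d)
{
	uint64_t q, rem;

	assert(d != 0);
	q = n / d;
	rem = n % d;

	/* rem >= d - rem is 2 * rem >= d without forming 2 * rem. */
	if (rem >= d - rem)
		q++;

	return q;
}
```

For n = UINT64_MAX and d = 2 the add-half variant wraps and yields 0,
while the remainder variant yields 2^63. Away from that edge the two
agree.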

>>
>> All in all, I am still not 100% sure if rounding is the right ambition.
>> Do we cause hidden accuracy issues by doing the rounding under the hood?
>> I feel I need bigger brains :)
> Don't we all!

Well, luckily software development can be seen as an iterative
process :)

Yours,
	-- Matti

Patch

diff --git a/drivers/iio/industrialio-gts-helper.c b/drivers/iio/industrialio-gts-helper.c
index 7653261d2dc2..7dc144ac10c8 100644
--- a/drivers/iio/industrialio-gts-helper.c
+++ b/drivers/iio/industrialio-gts-helper.c
@@ -18,6 +18,32 @@ 
 #include <linux/iio/iio-gts-helper.h>
 #include <linux/iio/types.h>
 
+static int iio_gts_get_gain_32(u64 full, unsigned int scale)
+{
+	unsigned int full32 = (unsigned int) full;
+	unsigned int rem;
+	int result;
+
+	if (full == (u64)full32) {
+		unsigned int rem;
+
+		result = full32 / scale;
+		rem = full32 - scale * result;
+		if (rem >= scale / 2)
+			result++;
+
+		return result;
+	}
+
+	rem = do_div(full, scale);
+	if ((u64)rem >= scale / 2)
+		result = full + 1;
+	else
+		result = full;
+
+	return result;
+}
+
 /**
  * iio_gts_get_gain - Convert scale to total gain
  *
@@ -28,30 +54,42 @@ 
  *		scale is 64 100 000 000.
  * @scale:	Linearized scale to compute the gain for.
  *
- * Return:	(floored) gain corresponding to the scale. -EINVAL if scale
+ * Return:	(rounded) gain corresponding to the scale. -EINVAL if scale
  *		is invalid.
  */
 static int iio_gts_get_gain(const u64 max, const u64 scale)
 {
-	u64 full = max;
+	u64 full = max, half_div;
+	unsigned int scale32 = (unsigned int) scale;
 	int tmp = 1;
 
-	if (scale > full || !scale)
+	if (scale / 2 > full || !scale)
 		return -EINVAL;
 
+	/*
+	 * The loop-based implementation below will potentially run _long_
+	 * if we have a small scale and large 'max' - which may be needed when
+	 * GTS is used for channels returning specific units. Luckily we can
+	 * avoid the loop when scale is small and fits in 32 bits.
+	 */
+	if ((u64)scale32 == scale)
+		return iio_gts_get_gain_32(full, scale32);
+
 	if (U64_MAX - full < scale) {
 		/* Risk of overflow */
-		if (full - scale < scale)
+		if (full - scale / 2 < scale)
 			return 1;
 
 		full -= scale;
 		tmp++;
 	}
 
-	while (full > scale * (u64)tmp)
+	half_div = scale >> 2;
+
+	while (full + half_div >= scale * (u64)tmp)
 		tmp++;
 
-	return tmp;
+	return tmp - 1;
 }
 
 /**
@@ -133,6 +171,7 @@  static int iio_gts_linearize(int scale_whole, int scale_nano,
  * Convert the total gain value to scale. NOTE: This does not separate gain
  * generated by HW-gain or integration time. It is up to caller to decide what
  * part of the total gain is due to integration time and what due to HW-gain.
+ * Computed gain is rounded to nearest integer.
  *
  * Return: 0 on success. Negative errno on failure.
  */
@@ -140,10 +179,13 @@  int iio_gts_total_gain_to_scale(struct iio_gts *gts, int total_gain,
 				int *scale_int, int *scale_nano)
 {
 	u64 tmp;
+	int rem;
 
 	tmp = gts->max_scale;
 
-	do_div(tmp, total_gain);
+	rem = do_div(tmp, total_gain);
+	if (total_gain > 1 && rem >= total_gain / 2)
+		tmp += 1ULL;
 
 	return iio_gts_delinearize(tmp, NANO, scale_int, scale_nano);
 }
@@ -192,7 +234,7 @@  static int gain_to_scaletables(struct iio_gts *gts, int **gains, int **scales)
 		sort(gains[i], gts->num_hwgain, sizeof(int), iio_gts_gain_cmp,
 		     NULL);
 
-		/* Convert gains to scales */
+		/* Convert gains to scales. */
 		for (j = 0; j < gts->num_hwgain; j++) {
 			ret = iio_gts_total_gain_to_scale(gts, gains[i][j],
 							  &scales[i][2 * j],